ECU ODS data integration using OWB and SSIS UNC Cause 2013

UNC CAUSE 2013 Integrating Oracle and non-Oracle External Data into the Ellucian/Banner ODS using Oracle Warehouse Builder (OWB) and Microsoft SQL Server Integration Services (SSIS) East Carolina University Enterprise Analytics Ruben Villasmil - [email protected] Keith Washer - [email protected]

description

ECU’s Extract, Transform, and Load (ETL) framework consists of two paths for loading external data into the Operational Data Store (ODS): one for non-Oracle data sources (Microsoft SQL Server, MS Access databases, web services) and one for Oracle data sources. The path is determined by the external system and the mechanism used to connect to it and extract the data. When the external system does not allow an Oracle-to-Oracle connection, Microsoft SQL Server Integration Services (SSIS) is used as the foundation for the non-Oracle data source path. When the external system allows an Oracle-to-Oracle connection, the Oracle data source path is selected. In this session we will present several major projects showcasing how ECU leverages SSIS, Oracle Streams, and the Ellucian/Banner ODS ETL process to load various types of external data into the Ellucian/Banner ODS.

Transcript of ECU ODS data integration using OWB and SSIS UNC Cause 2013

Page 1: ECU ODS data integration using OWB and SSIS UNC Cause 2013

UNC CAUSE 2013: Integrating Oracle and non-Oracle External Data into the Ellucian/Banner ODS using Oracle Warehouse Builder (OWB) and Microsoft SQL Server Integration Services (SSIS)

East Carolina University Enterprise Analytics

Ruben Villasmil - [email protected]

Keith Washer - [email protected]

Page 2: ECU ODS data integration using OWB and SSIS UNC Cause 2013

Integrating Oracle and non-Oracle External Data into ODS

Why this session?

East Carolina University relies on numerous information systems to accomplish its mission and needs unified reporting across them.

In this session we will present several projects showcasing how ECU leverages the Ellucian/Banner ETL methodology, Oracle Streams, OWB, and Microsoft SQL Server Integration Services (SSIS) to load various types of external data into the ECU Operational Data Store (ODS).

 

Page 3: ECU ODS data integration using OWB and SSIS UNC Cause 2013

ECU Operational Data Store (ODS) – Hosted Systems

Page 4: ECU ODS data integration using OWB and SSIS UNC Cause 2013

Integrating Oracle and non-Oracle External Data into ODS

Page 5: ECU ODS data integration using OWB and SSIS UNC Cause 2013

ECU's Extract Transform and Load (ETL) Framework consists of two paths for loading external data into the Ellucian/Banner Operational Data Store (ODS):

• Oracle Data Source Path:
  1. Data resides in Oracle.
  2. ECU DBAs manage the data sources (i.e., BlackBoard, DegreeWorks).
  3. Tools and methods:
     • Ellucian’s ODS ETL process
     • Oracle Warehouse Builder, with SQL and PL/SQL scripts where necessary
  4. Rationale:
     • The infrastructure for this path was already in place as a result of the ODS implementation. No additional tools or costs were required for this path.

• Non-Oracle and Oracle (non-managed) Data Source Path:
  1. Data resides in non-Oracle systems such as Microsoft SQL Server, MS Access databases, web services, or flat files.
  2. Data resides in an external Oracle system and ECU DBAs do not manage the data source.
  3. Tools and methods:
     • Microsoft SQL Server Integration Services (SSIS) with best practices and standards
  4. Rationale:
     • In-house expertise with Microsoft’s Business Intelligence stack (Reporting Services, Integration Services, Analysis Services) architecture and products: MS SQL Server, BIDS/Visual Studio, SharePoint.
     • The tool provides native connectivity components to heterogeneous systems.

Integrating Oracle and non-Oracle External Data into ODS

Page 6: ECU ODS data integration using OWB and SSIS UNC Cause 2013

Integrating Oracle and non-Oracle External Data into ODS (Continued)

 

Oracle Data Source Path Projects include: 

• BlackBoard: From 1 billion records to 20+ million records for reporting. Oracle Streams is used to pre-filter which data is replicated, and summary Object Access views are used to load the final tables for reporting.

• Sciquest XML data extracts: Oracle Streams is used to replicate the CLOB containing the Sciquest XML message. Oracle XML syntax is then used in the ETL process to parse the XML message and load requisition and purchasing data into the ODS.

• DegreeWorks (DW): Oracle Streams is used to replicate the necessary DW tables. The standard ETL process is then used to load DW data into the ODS. Data is presented to users via 38 CPA reporting views.

Non-Oracle and Oracle (non-managed) Data Source Path Projects include:

• SSIS Infrastructure: ECU SSIS Package Automation Tool, ETL package logging, ETL package execution reporting

• RAMSeS (Research Administration Management System & eSubmission): a comprehensive web-based Electronic Research Administration (eRA) system to manage research more efficiently and effectively. RAMSeS includes several years of proposal and award data.

Page 7: ECU ODS data integration using OWB and SSIS UNC Cause 2013

Oracle Data Source Path

Page 8: ECU ODS data integration using OWB and SSIS UNC Cause 2013

Oracle Data Source Path: Ellucian’s ETL Methodology

Page 9: ECU ODS data integration using OWB and SSIS UNC Cause 2013

Oracle Data Source Path: ECU’s ETL Process

Page 10: ECU ODS data integration using OWB and SSIS UNC Cause 2013

Oracle Data Source Path: ECU ETL OBJECT SUMMARY

ECU’s custom ETLs and reporting views footprint is approaching Ellucian’s.

ECU ETL FOOTPRINT

Load group            Streamed tables   OWB maps
ECU and ECUODS        50*               179 (ECU schema 140, ECUODS schema 39)
Census Day Freeze                       14
Sciquest              1                 21
BlackBoard (BB)**     13                4
DegreeWorks (DW)      22                22
Total                                   240

* In addition to Ellucian-delivered streamed tables.
** Currently developing a new ETL process to freeze BB student grade book data. The process will add 7 additional OWB maps.

Other objects related to ECU ETLs: 452 reporting views.

ELLUCIAN’S ODS FOOTPRINT

ODSMGR OWB ETL: 252 load maps, 225 update maps, 226 delete maps
ODSMGR reporting views: 505

Streamed tables:

Schema       (PBAN)    Stage (ODS)
FAISMGR      403       94
FIMSMGR      571       206
GENERAL      358       111
ONESTOP *    297       36
PAYROLL      485       168
PORTAL2 *    144       8
POSNCTL      146       31
SATURN       1,230     436
TAISMGR      158       49
Total        3,792     1,139

* ECU-developed schemas

Page 11: ECU ODS data integration using OWB and SSIS UNC Cause 2013

•System: BlackBoard Learning Management System.

•Requirements: Identify tool/application utilization per Academic Period.

•Tools: Oracle, BB data dictionary, UMBC BB Project (http://www.umbc.edu/oit/newmedia/blackboard/stats/)

•Challenges:
• Managing 1 billion records for reporting (number of records needed to determine utilization by college, student profile, and course attributes).
• Identifying application paths for summary data.
• Joining BB data with ODS course data.

•Project summary: Developed 4 OWB maps to extract data from streamed tables. Developed 7 reporting views and 7 BIDS reports.

ORACLE Data Path: BlackBoard

Page 12: ECU ODS data integration using OWB and SSIS UNC Cause 2013

Solving the Challenge:

•Developed Summary Composite views grouped by month, course, user and application (Reducing the data for reporting from 1 billion to 20 million records).

•Identified additional application paths by extracting the information from the activity accumulator “DATA” column.

•Mapped BB users to ODS by Banner ID, then used the ODS person data to get student demographics.

•Mapped BB course identifiers to ODS by parsing the course batch_uid: SUBSTR (C.batch_uid, 7) bb_academic_period, SUBSTR (C.batch_uid, 1, 5) bb_crn where INSTR (batch_uid, '.') = 6.

•Created OWB map to extract ODS Academic Study data into a separate table for Performance Improvement.
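The batch_uid parsing rule above can be sketched outside the database. The following is a minimal Python illustration, assuming a batch_uid of the form "CRN.academic period" with a 5-character CRN. The function name, the CourseKey type, and the sample value are hypothetical and not part of ECU's ETL.

```python
from typing import NamedTuple, Optional


class CourseKey(NamedTuple):
    crn: str
    academic_period: str


def parse_batch_uid(batch_uid: str) -> Optional[CourseKey]:
    """Mirror the slide's SUBSTR/INSTR logic for one batch_uid value."""
    # INSTR(batch_uid, '.') = 6: the first '.' must be the 6th character
    # (Oracle positions are 1-based, Python indexes are 0-based).
    if batch_uid.find(".") != 5:
        return None
    # SUBSTR(batch_uid, 1, 5) -> CRN; SUBSTR(batch_uid, 7) -> academic period.
    return CourseKey(crn=batch_uid[:5], academic_period=batch_uid[6:])


# Hypothetical sample value, for illustration only.
print(parse_batch_uid("12345.201330"))
print(parse_batch_uid("ORG-SANDBOX"))  # no '.' in position 6, so no match
```

Rows whose batch_uid does not follow the pattern are skipped, which corresponds to the WHERE clause filtering them out of the join to ODS course data.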

ORACLE Data Path: BlackBoard

Page 13: ECU ODS data integration using OWB and SSIS UNC Cause 2013

ORACLE Data Path: BlackBoard

Page 14: ECU ODS data integration using OWB and SSIS UNC Cause 2013

Composite/Object Access to summarize the activity accumulator data
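As a rough illustration of the summarization idea (grouping activity records by month, course, user, and application so reporting reads counts rather than raw events), here is a small Python sketch. The record layout and field names are assumptions made for the example and do not reflect the actual activity accumulator columns or ECU's views.

```python
from collections import Counter
from datetime import datetime


def summarize_activity(events):
    """Count events per (month, course, user, application) group."""
    counts = Counter()
    for event in events:
        month = event["timestamp"].strftime("%Y-%m")
        key = (month, event["course_id"], event["user_id"], event["application"])
        counts[key] += 1
    return counts


# Hypothetical events, for illustration only.
events = [
    {"timestamp": datetime(2013, 9, 3, 10, 0), "course_id": "C1",
     "user_id": "U1", "application": "discussion_board"},
    {"timestamp": datetime(2013, 9, 17, 14, 30), "course_id": "C1",
     "user_id": "U1", "application": "discussion_board"},
    {"timestamp": datetime(2013, 10, 1, 9, 15), "course_id": "C1",
     "user_id": "U1", "application": "gradebook"},
]
for key, hits in summarize_activity(events).items():
    print(key, hits)
```

Each group collapses many raw rows into one summary row, which is the mechanism behind the reduction from 1 billion to 20 million records described earlier.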

ORACLE Data Path: BlackBoard

Page 15: ECU ODS data integration using OWB and SSIS UNC Cause 2013

ORACLE Data Path: BlackBoard Sample Reports

Page 16: ECU ODS data integration using OWB and SSIS UNC Cause 2013

•System: SCIQUEST Requisition (PR) and Purchasing (PO) system (Third party vendor).

•Requirements: Extract PR and PO data from xml message delivered nightly by Sciquest.

•Tools: Oracle, XMLSpy

•Challenges:
• Security: Validating the “original” XML required Oracle to go over the internet to Sciquest.
• Performance: Oracle XDB parsing/validation was impacting other processes on the production database.
• Performance: Extracting XML data via views for reporting is sluggish (messages are 10+ MB).
• Oracle issues pivoting data extracted from XML via relational views.
• Handling PO/PR updates: the Sciquest XML message contains the latest PO/PR information.

•Project summary: Developed 21 OWB maps to extract XML data from 1 streamed table. Developed 16 PO reporting views and 13 PR reporting views. The user is developing a report solution in BIDS.

Oracle Data Source Path: Sciquest

Page 17: ECU ODS data integration using OWB and SSIS UNC Cause 2013

Solving the Challenge:

•Security: Removed the Sciquest schema reference from the XML message:
REGEXP_REPLACE (SUBSTR (HTTP_CONTENT, 1, 1000), '(.*?)(<!DOCTYPE.*?">)(.*)|(xmlns="http://solutions.*?xsd")(.*)', '\1\3')

•Performance validating XML: Streamed the xmlreceipt table to ODS. Leveraged Oracle 11g xmltype/CLOB, which allows the XML extract functions to run without validating the entire content (the user assumes the XML is valid). No need for XDB to parse the XML.

•Performance querying xmltype: Changed the approach for reporting. Data is extracted nightly and appended to composite tables. Reporting views are based on the composite tables, with no XML extracts in the reporting views.

•PO/PR updates: Created an OWB map to track transactions loaded. Created delete OWB maps for the PO and PR composite tables. The PO and PR OWB maps are set to insert only.
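The security fix above removes the external DTD and schema references so the parser never has to reach out to Sciquest. Below is a minimal Python sketch of the same idea using the standard re module. It is not a translation of the Oracle expression: it applies two simpler patterns, and it assumes the cleaned head is re-joined with the rest of the message. The function name and sample messages are hypothetical.

```python
import re

# Patterns adapted from the slide's expression; DOTALL lets '.' span newlines.
DOCTYPE_RE = re.compile(r'<!DOCTYPE.*?">', re.DOTALL)
XMLNS_RE = re.compile(r'\s*xmlns="http://solutions.*?xsd"', re.DOTALL)


def strip_external_references(xml_text: str, head_chars: int = 1000) -> str:
    """Remove the DOCTYPE and schema namespace reference from the message head."""
    # Only the first head_chars characters are scanned, as with SUBSTR(..., 1, 1000).
    head, tail = xml_text[:head_chars], xml_text[head_chars:]
    head = DOCTYPE_RE.sub("", head, count=1)
    head = XMLNS_RE.sub("", head, count=1)
    return head + tail


# Hypothetical message, for illustration only.
sample = ('<?xml version="1.0"?>'
          '<!DOCTYPE Msg SYSTEM "http://example.invalid/msg.dtd">'
          '<Msg><PO id="1"/></Msg>')
print(strip_external_references(sample))
```

Limiting the scan to the head of the message keeps the cost low for messages of 10 MB or more, since the declarations being removed appear near the top of the document.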

Oracle Data Source Path: Sciquest

Page 18: ECU ODS data integration using OWB and SSIS UNC Cause 2013

Oracle Data Source Path: Sciquest

OWB maps registered with IA Admin

Page 19: ECU ODS data integration using OWB and SSIS UNC Cause 2013

Oracle Data Source Path: Sciquest

XMLSpy Walking the path

Page 20: ECU ODS data integration using OWB and SSIS UNC Cause 2013

Oracle Data Source Path: Sciquest

Used in the first OWB map to load the latest XML message

Sample XML extract used in other OWB maps (18 composite views use this method)

Page 21: ECU ODS data integration using OWB and SSIS UNC Cause 2013

Oracle Data Source Path: Sciquest

Delete map: deletes the same PO/PR ID if it exists in the composite tables.

Sample OWB Map: LOAD_EFT_PO_HDR_CUST_FIELDS
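The delete-then-insert pattern described for PO/PR updates can be sketched as follows. This Python example stands in for the delete map and the insert-only load map using plain lists. The row layout and the po_id key name are assumptions made for illustration, not the actual composite table columns.

```python
def apply_latest_message(composite_rows, incoming_rows, key="po_id"):
    """Replace existing rows for any ID present in the incoming message."""
    incoming_ids = {row[key] for row in incoming_rows}
    # Delete map: drop existing rows whose ID appears in the new message.
    kept = [row for row in composite_rows if row[key] not in incoming_ids]
    # Load map: insert only, so the newest version of each ID is appended.
    return kept + list(incoming_rows)


# Hypothetical rows, for illustration only.
composite = [
    {"po_id": "P100", "status": "OPEN"},
    {"po_id": "P200", "status": "OPEN"},
]
message = [{"po_id": "P100", "status": "CLOSED"}]
print(apply_latest_message(composite, message))
```

Because each Sciquest message carries the latest state of a PO or PR, deleting by ID before inserting keeps one current version per ID without requiring update logic in the load maps.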

Page 22: ECU ODS data integration using OWB and SSIS UNC Cause 2013

•System: Ellucian DegreeWorks, a curriculum and planning tool for students and advisors.

•Requirements: Provide access to DegreeWorks audit and planning data to the Registrar’s Office and advisors via the ODS.

•Tools: Oracle, DW CPA reporting guide, DW sample reports (earlier versions).

•Challenges:

• Data structures in DW (CHAR vs. VARCHAR).

•Project summary: Developed 22 OWB maps and 38 DW reporting views. During development, issues were identified with the audit data, and Ellucian provided updated software to correct them. Currently working with the users to develop a report solution.

Oracle Data Source Path: DegreeWorks

Page 23: ECU ODS data integration using OWB and SSIS UNC Cause 2013

Solving the Challenge:

•Data structures: Developed a script that uses data dictionary information to automate DDL creation for composite views. The composite views cast and trim source table columns based on column data type:

CAST (TRIM (DAP_STU_ID) AS VARCHAR2 (10)) DAP_STU_ID, CAST (TRIM (DAP_SCHOOL) AS VARCHAR2 (12)) DAP_SCHOOL,
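A script of this kind might look roughly like the Python sketch below, which builds a view definition from column metadata. It assumes the metadata has already been read from the data dictionary as (column name, data type, length) tuples and that only CHAR columns need the CAST/TRIM treatment. The view name, table name, and the DAP_CREATE_DATE column are hypothetical.

```python
def composite_view_ddl(view_name, source_table, columns):
    """Build CREATE VIEW text that trims CHAR columns into VARCHAR2."""
    select_items = []
    for name, data_type, length in columns:
        if data_type == "CHAR":
            # Fixed-width CHAR values are blank-padded; trim and cast them.
            select_items.append(
                f"CAST (TRIM ({name}) AS VARCHAR2 ({length})) {name}")
        else:
            # Other data types pass through unchanged.
            select_items.append(name)
    select_list = ",\n       ".join(select_items)
    return (f"CREATE OR REPLACE VIEW {view_name} AS\n"
            f"SELECT {select_list}\n"
            f"  FROM {source_table}")


# Hypothetical object names and metadata, for illustration only.
columns = [
    ("DAP_STU_ID", "CHAR", 10),
    ("DAP_SCHOOL", "CHAR", 12),
    ("DAP_CREATE_DATE", "DATE", 7),
]
print(composite_view_ddl("EXAMPLE_COMPOSITE_VIEW", "EXAMPLE_DW_TABLE", columns))
```

Generating the DDL from metadata avoids hand-writing the cast for every CHAR column across the 22 streamed DW tables.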

ORACLE Data Path: DegreeWorks

Page 24: ECU ODS data integration using OWB and SSIS UNC Cause 2013

ORACLE Data Path: DegreeWorks

Page 25: ECU ODS data integration using OWB and SSIS UNC Cause 2013

ORACLE Data Path: DegreeWorks

OWB maps registered with IA Admin

Page 26: ECU ODS data integration using OWB and SSIS UNC Cause 2013

Non Oracle and Oracle(non-managed) Data Source Path

Page 27: ECU ODS data integration using OWB and SSIS UNC Cause 2013

High-Level Overview: ECU ETL SSIS ARCHITECTURE

[Diagram: SOURCE (flat files, RDBMS, web services); Integration Services (SQL Agent Job, SSIS Package Store, SSIS Logging); DESTINATION (ODS)]

Page 28: ECU ODS data integration using OWB and SSIS UNC Cause 2013

ECU ETL SSIS PROCESS for Non-Oracle Data Sources

Page 29: ECU ODS data integration using OWB and SSIS UNC Cause 2013

SSIS Integration Services Solution: Business Intelligence Development Studio (BIDS)

Page 30: ECU ODS data integration using OWB and SSIS UNC Cause 2013

• Source System: Microsoft SQL Server (hosted at UNC-Chapel Hill)
• Application: RAMSeS
• Objects analyzed: 125 tables + 23 views = 148 source objects
• Tools used: Microsoft Business Intelligence Development Studio, C# and .NET, SQL Server Integration Services, ECU SSIS Package Automation Tool, TOAD, SQL scripts
• Challenges:
  • Previously developed ETL packages took several days to several weeks to complete, with only 6-12 source objects.
  • Inconsistency when implementing SSIS package naming/ETL standards combined with standard SSIS design.
  • Previously developed SSIS packages had no robust logging or package execution reporting.
• Project summary: Microsoft SQL Server Integration Services and ECU’s SSIS Package Automation Tool are used to create an ETL package that extracts and loads data from the RAMSeS database into the Ellucian/Banner ODS. Integrating the RAMSeS data into the ODS allows the Department of Institutional Research to compile an annual report in under 4 hours, which previously required a full 8 hours. Creating initial SSIS packages has been reduced to under 5 minutes using the ECU SSIS Package Automation Tool. SSIS logging within the package tracks execution errors, warnings, and package duration. The existing BI stack’s Reporting Services instance hosts an SSIS Package Execution Summary Report for daily monitoring of SSIS ETL package executions.

SSIS Data Path: RAMSeS

Page 31: ECU ODS data integration using OWB and SSIS UNC Cause 2013

SSIS Data Path: RAMSeS

Solving the Challenge:

•Implemented staging and target schemas in the ODS.

•Used the SSIS Import/Export Wizard to quickly generate RAMSeS stage and target destination tables.

•Developed the ECU SSIS Package Automation Tool (Script Task, C# .NET), integrating ECU’s methodology and existing ETL standards to efficiently build a standardized ETL package.

•Developed an object mapping table to support validation.

•Leveraged existing SSIS logging features, which the automation tool configures automatically during package creation.

•Leveraged the existing Reporting Services instance to host an SSIS Package Execution Summary Dashboard, developed in BIDS, for daily monitoring of package/ETL job execution.

Page 32: ECU ODS data integration using OWB and SSIS UNC Cause 2013

ECU ETL SSIS ARCHITECTURE for External Data Sources

[Diagram: SOURCE (flat files, RDBMS, web services); Integration Services (SQL Agent Job, SSIS Package Store, SSIS Logging); DESTINATION (ODS); ETL Job Execution Reporting]

1. SSIS Import Export Wizard (creates destination tables in ODS)

2. Create Object Mapping Table

3. Generate SSIS ETL Package encoded with design/ETL standards

4. Deploy Package to SSIS ETL Server

Page 33: ECU ODS data integration using OWB and SSIS UNC Cause 2013

SQL Server Import Export Wizard: Creating Destination Tables in the ECUBIC SSIS Staging Schema(ODS)

Page 34: ECU ODS data integration using OWB and SSIS UNC Cause 2013

Creation of the Object Mapping Table: RAM_OBJECT_MAPPING

Page 35: ECU ODS data integration using OWB and SSIS UNC Cause 2013

ECU SSIS Package Automation Tool: Building the RAMSES ETL Package

Page 36: ECU ODS data integration using OWB and SSIS UNC Cause 2013

SSIS ETL Package Execution Reporting