ECU ODS data integration using OWB and SSIS UNC Cause 2013
UNC CAUSE 2013
Integrating Oracle and non-Oracle External Data into the Ellucian/Banner ODS using Oracle Warehouse Builder (OWB) and Microsoft SQL Server Integration Services (SSIS)
East Carolina University Enterprise Analytics
Ruben Villasmil, Keith Washer
Integrating Oracle and non-Oracle External Data into ODS
Why this session?
East Carolina University uses numerous information systems to accomplish its mission and needs unified reporting across them.
In this session we will present several projects showcasing how ECU leverages the Ellucian/Banner ETL methodology, Oracle Streams, OWB, and Microsoft SQL Server Integration Services (SSIS) to load various types of external data into the ECU Operational Data Store (ODS).
ECU Operational Data Store (ODS) – Hosted Systems
Integrating Oracle and non-Oracle External Data into ODS
ECU's Extract, Transform, and Load (ETL) framework consists of two paths for loading external data into the Ellucian/Banner Operational Data Store (ODS):
• Oracle Data Source Path:
  1. Data resides in Oracle.
  2. ECU DBAs manage the data sources (e.g., BlackBoard, DegreeWorks).
  3. Tools and methods:
     • Ellucian’s ODS ETL process
     • Oracle Warehouse Builder, plus SQL and PL/SQL scripts where necessary
  4. Rationale:
     • The infrastructure for this path was already in place as a result of the ODS implementation. No additional tools were required and no additional cost was incurred for this path.
• Non-Oracle and Oracle (non-managed) Data Source Path:
  1. Data resides in non-Oracle systems such as Microsoft SQL Server, MS Access databases, web services, or flat files.
  2. Data resides in an external Oracle system and ECU DBAs do not manage the data source.
  3. Tools and methods:
     • Microsoft SQL Server Integration Services (SSIS) with best practices and standards.
  4. Rationale:
     • In-house expertise with Microsoft’s Business Intelligence stack (Reporting Services, Integration Services, Analysis Services) architecture and products: MS SQL Server, BIDS/Visual Studio, SharePoint.
     • The tool provides native connectivity components to heterogeneous systems.
Integrating Oracle and non-Oracle External Data into ODS (Continued)
Oracle Data Source Path Projects include:
• BlackBoard: Reduced from 1 billion records to 20 million+ records for reporting. Oracle Streams is used to pre-filter what data is replicated, and summary Object Access views are used to load the final tables for reporting.
• Sciquest XML data extracts: Oracle Streams is used to replicate the CLOB containing the Sciquest XML message. Oracle XML syntax is then used in the ETL process to parse the XML message and load Requisition and Purchasing data into the ODS.
• DegreeWorks (DW): Oracle Streams is used to replicate the necessary DW tables. The standard ETL process is then used to load DW data into the ODS. Data is presented to the users via 38 CPA reporting views.
Non-Oracle and Oracle (non-managed) Data Source Path Projects include:
• SSIS Infrastructure: ECU SSIS Package Automation Tool, ETL Package Logging, ETL Package Execution Reporting
• RAMSeS (Research Administration Management System & eSubmission): a comprehensive web-based Electronic Research Administration (eRA) system to manage research more efficiently and effectively. RAMSeS includes several years of proposal and award data.
Oracle Data Source Path
Oracle Data Source Path: Ellucian’s ETL Methodology
Oracle Data Source Path: ECU’s ETL Process
Oracle Data Source Path: ECU ETL OBJECT SUMMARY
ECU’s custom ETLs and reporting views footprint is approaching Ellucian’s.

ECU ETL FOOTPRINT
Load group            Streamed tables   OWB maps
ECU and ECUODS        50*               179 (ECU 140, ECUODS 39)
Census Day Freeze                       14
Sciquest              1                 21
BlackBoard (BB)**     13                4
DegreeWorks (DW)      22                22
Total                                   240
* In addition to Ellucian-delivered streamed tables.
** Currently developing a new ETL process to freeze BB student grade book data. The process will add 7 additional OWB maps.

Other objects related to ECU ETLs
Reporting views       452

ELLUCIAN’S ODS FOOTPRINT
ODSMGR OWB ETL
Load Maps             252
Update Maps           225
Delete Maps           226
ODSMGR Reporting views   505

Streamed Tables
Schema        PBAN      Stage (ODS)
FAISMGR       403       94
FIMSMGR       571       206
GENERAL       358       111
ONESTOP *     297       36
PAYROLL       485       168
PORTAL2 *     144       8
POSNCTL       146       31
SATURN        1,230     436
TAISMGR       158       49
Total         3,792     1,139
* ECU-developed schemas
• System: BlackBoard Learning Management System.
• Requirements: Identify tool/application utilization per Academic Period.
• Tools: Oracle, BB Data dictionary, UMBC BB Project (http://www.umbc.edu/oit/newmedia/blackboard/stats/)
• Challenges:
  • Managing 1 billion records for reporting (number of records needed to determine utilization by College, student profile, and course attributes).
  • Identifying application paths for summary data.
  • Joining BB data with ODS course data.
• Project summary: Developed 4 OWB maps to extract data from streamed tables. Developed 7 reporting views and 7 BIDS reports.
ORACLE Data Path: BlackBoard
Solving the Challenge:
• Developed summary composite views grouped by month, course, user, and application, reducing the data for reporting from 1 billion to 20 million records.
• Identified additional application paths by extracting the information from the activity accumulator “DATA” column.
• Mapped BB users to ODS by Banner ID. Use ODS person to get student demographics.
• Mapped BB course identifiers to ODS by parsing the course batch_uid: SUBSTR (C.batch_uid, 7) bb_academic_period, SUBSTR (C.batch_uid, 1, 5) bb_crn WHERE INSTR (batch_uid, '.') = 6.
• Created an OWB map to extract ODS Academic Study data into a separate table to improve performance.
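The batch_uid parsing above can be sketched outside the database as follows. This is a minimal illustration in Python rather than the Oracle SQL used in the project; the function name, the CourseKey type, and the sample values are assumptions made for the example, and the only logic taken from the slide is the SUBSTR/INSTR rule.

```python
from typing import NamedTuple, Optional


class CourseKey(NamedTuple):
    """Hypothetical container for the two values parsed from batch_uid."""
    crn: str
    academic_period: str


def parse_batch_uid(batch_uid: str) -> Optional[CourseKey]:
    """Split a BB batch_uid of the form '<5-character CRN>.<academic period>'."""
    # Oracle's INSTR is 1-based, so INSTR(batch_uid, '.') = 6 means the first
    # dot sits at 0-based index 5 in Python.
    if batch_uid.find(".") != 5:
        return None  # the SQL filter skips identifiers that do not fit the pattern
    crn = batch_uid[:5]              # SUBSTR(batch_uid, 1, 5)
    academic_period = batch_uid[6:]  # SUBSTR(batch_uid, 7)
    return CourseKey(crn, academic_period)
```

Identifiers whose first dot is not in the sixth position return None, which mirrors rows the WHERE clause would exclude from the mapping.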
ORACLE Data Path: BlackBoard
ORACLE Data Path: BlackBoard
Composite/Object Access to summarize the activity accumulator data
ORACLE Data Path: BlackBoard
ORACLE Data Path: BlackBoard Sample Reports
• System: SCIQUEST Requisition (PR) and Purchasing (PO) system (third-party vendor).
• Requirements: Extract PR and PO data from the XML message delivered nightly by Sciquest.
• Tools: ORACLE, XMLSPY
• Challenges:
  • Security: Validating the “original” XML required Oracle to go over the internet to Sciquest.
  • Performance: Oracle XDB parsing/validating was impacting other processes on the production database.
  • Performance: Extracting XML data via views for reporting is sluggish (messages are 10 MB+).
  • Oracle issues when pivoting data extracted from XML via relational views.
  • Handling PO/PR updates: the Sciquest XML message contains the latest PO/PR information.
• Project summary: Developed 21 OWB maps to extract XML data from 1 streamed table. Developed 16 PO reporting views and 13 PR reporting views. The user is developing a report solution in BIDS.
Oracle Data Source Path: Sciquest
Solving the Challenge:
• Security: Removed the Sciquest schema reference from the XML message: REGEXP_REPLACE (SUBSTR (HTTP_CONTENT, 1, 1000), '(.*?)(<!DOCTYPE.*?">)(.*)|(xmlns="http://solutions.*?xsd")(.*)', '\1\3')
• Performance validating XML: Streamed the xmlreceipt table to ODS. Leveraged the Oracle 11g xmltype/CLOB, which allows the “extract xml functions” to run without validating the entire content (the user assumes the XML is valid). No need for XDB to parse the XML.
• Performance querying xmltype: Changed the approach for reporting. Data is extracted nightly and appended to composite tables. Reporting views are based on the composite tables, with no XML extracts in the reporting views.
• PO/PR updates: Created an OWB map to track transactions loaded. Created delete OWB maps for the PO and PR composite tables. The PO and PR OWB maps are set to insert only.
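The security fix above removes the external schema reference before the message is parsed. A rough Python sketch of the DOCTYPE branch of that idea follows; it is not a translation of the full Oracle expression, the function name and the 1000-character default are illustrative choices modeled on the SUBSTR call, and the xmlns branch is left out.

```python
import re

# Matches a DOCTYPE declaration that ends with a quoted system identifier.
_DOCTYPE = re.compile(r'<!DOCTYPE.*?">', re.DOTALL)


def strip_doctype(xml_text: str, head_chars: int = 1000) -> str:
    """Remove the first DOCTYPE declaration found in the head of the message."""
    # Only the first head_chars characters are searched, as the SUBSTR call
    # on the slide does; the rest of the message is reattached unchanged.
    head, tail = xml_text[:head_chars], xml_text[head_chars:]
    return _DOCTYPE.sub("", head, count=1) + tail
```

A declaration that straddles the head_chars boundary would not be removed by this sketch, so the limit has to be larger than the message prolog.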
Oracle Data Source Path: Sciquest
Oracle Data Source Path: Sciquest
OWB maps registered with IA Admin
Oracle Data Source Path: Sciquest
XMLSpy Walking the path
Oracle Data Source Path: Sciquest
Used in first OWB map to load the latest XML message
Sample XML extract used in other OWB maps (18 composite views use this method)
Oracle Data Source Path: Sciquest
Delete map – Deletes the same PO/PR ID if it exists in the composite tables.
Sample OWB Map: LOAD_EFT_PO_HDR_CUST_FIELDS
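The delete map followed by an insert-only load map amounts to a delete-then-insert refresh keyed on the PO/PR ID. The sketch below shows that pattern with Python's built-in sqlite3 module instead of OWB; the table name po_composite and its two columns are hypothetical stand-ins for the composite tables.

```python
import sqlite3


def reload_purchase_orders(conn: sqlite3.Connection, latest_rows) -> None:
    """Replace composite rows for every PO ID present in the latest message.

    latest_rows is an iterable of (po_id, status) tuples; both column names
    are assumptions made for this example.
    """
    rows = list(latest_rows)
    with conn:  # commit both steps together, or roll back on error
        # Delete step: drop any earlier version of the same PO ID.
        conn.executemany(
            "DELETE FROM po_composite WHERE po_id = ?",
            [(po_id,) for po_id, _status in rows],
        )
        # Load step: insert only, since stale rows are already gone.
        conn.executemany(
            "INSERT INTO po_composite (po_id, status) VALUES (?, ?)",
            rows,
        )
```

Because the message always carries the latest PO/PR information, deleting first lets the load step stay insert-only without needing update logic.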
• System: Ellucian DegreeWorks, a curriculum and planning tool for students and advisors.
• Requirements: Provide access to DegreeWorks Audit and Planning data to the Registrar's Office and advisors via ODS.
• Tools: ORACLE, DW CPA reporting guide, DW sample reports (earlier versions)
• Challenges:
  • Data structures in DW (CHAR vs. VARCHAR).
• Project summary: Developed 22 OWB maps. Developed 38 DW reporting views. During development, identified issues with Audit data; Ellucian provided updated software to correct the data issue. Currently working with the users to develop a report solution.
Oracle Data Source Path: DegreeWorks
Solving the Challenge:
• Data structures: Developed a script that leverages the data dictionary information to automate the DDL creation for composite views. Composite views cast and trim source table columns based on column data type, for example:
CAST (TRIM (DAP_STU_ID) AS VARCHAR2 (10)) DAP_STU_ID, CAST (TRIM (DAP_SCHOOL) AS VARCHAR2 (12)) DAP_SCHOOL
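A script of this kind might look roughly like the following Python sketch. It assumes the column metadata has already been read from a data dictionary view such as ALL_TAB_COLUMNS as (column name, data type, data length) tuples, and that only CHAR columns are cast and trimmed; the function name, view name, and table name are hypothetical.

```python
def build_composite_view_ddl(view_name, source_table, columns):
    """Build CREATE VIEW text that trims CHAR columns into VARCHAR2.

    columns is a list of (column_name, data_type, data_length) tuples.
    """
    select_items = []
    for column_name, data_type, data_length in columns:
        if data_type == "CHAR":
            # Fixed-width CHAR values are blank padded, so trim and recast them.
            select_items.append(
                f"CAST (TRIM ({column_name}) AS VARCHAR2 ({data_length})) {column_name}"
            )
        else:
            select_items.append(column_name)  # other types pass through as-is
    select_list = ",\n       ".join(select_items)
    return (
        f"CREATE OR REPLACE VIEW {view_name} AS\n"
        f"SELECT {select_list}\n"
        f"  FROM {source_table}"
    )
```

Generating the DDL from metadata keeps the cast lengths in step with the source columns, which matters when the same treatment has to be repeated across many tables.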
ORACLE Data Path: DegreeWorks
ORACLE Data Path: DegreeWorks
ORACLE Data Path: DegreeWorks
OWB maps registered with IA Admin
Non-Oracle and Oracle (non-managed) Data Source Path
High-Level Overview: ECU ETL SSIS ARCHITECTURE
[Diagram: SOURCE (flat files, RDBMS, web services) → Integration Services (SQL Agent Job, SSIS Package Store, SSIS Logging) → DESTINATION (ODS)]
ECU ETL SSIS PROCESS for Non-Oracle Data Sources
SSIS - Integration Services Solution
Business Intelligence Development Studio (BIDS)
• Source System: Microsoft SQL Server (hosted at UNC-Chapel Hill)
• Application: RAMSeS
• Objects analyzed: 125 tables + 23 views = 148 source objects
• Tools used: Microsoft Business Intelligence Development Studio, C# and .NET, SQL Server Integration Services, ECU SSIS Package Automation Tool, TOAD, SQL scripts
• Challenges:
  • Previously developed ETL packages took several days to several weeks to complete, with only 6-12 source objects.
  • Inconsistency when implementing SSIS package naming and ETL standards combined with standard SSIS design.
  • Previously developed SSIS packages had no robust logging or package execution reporting.
• Project summary: Microsoft SQL Server Integration Services and ECU’s SSIS Package Automation Tool are used to create an ETL package that extracts data from the RAMSeS database and loads it into the Ellucian/Banner ODS. Integrating the RAMSeS data into the ODS allows the Department of Institutional Research to compile an annual report in under 4 hours, which previously required a full 8 hours. Creating the initial SSIS packages has been reduced to under 5 minutes using the ECU SSIS Package Automation Tool. SSIS logging within the package is used to track execution errors, warnings, and package duration. The existing BI stack's Reporting Services instance hosts an SSIS Package Execution Summary Report for daily monitoring of SSIS ETL package executions.
SSIS Data Path: RAMSeS
SSIS Data Path: RAMSeS
Solving the Challenge:
• Implemented Staging and Target schemas in the ODS.
• Utilized the SSIS Import/Export Wizard to quickly generate RAMSeS stage and target destination tables.
• Developed the ECU SSIS Package Automation Tool (Script Task, C# .NET), integrating ECU’s methodology and existing ETL standards to efficiently build a standardized ETL package.
• Developed an Object Mapping table to support validation.
• Leveraged existing SSIS logging features, configured automatically by the automation tool during package creation.
• Leveraged the existing Reporting Services instance to host an SSIS Package Execution Summary Dashboard, developed in BIDS, for daily monitoring of package/ETL job execution.
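One way an object mapping table could support validation is a completeness check run before a package is generated. The Python sketch below is only an illustration of that idea: the slides do not show the structure of the mapping table, so the three fields and the function name are assumptions, and the actual tool is written in C#.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectMapping:
    """Hypothetical mapping row: one source object and its ODS tables."""
    source_object: str
    stage_table: str
    target_table: str


def validate_mappings(source_objects, mappings):
    """Return a list of problems; an empty list means the mapping is complete."""
    problems = []
    seen = set()
    for mapping in mappings:
        if mapping.source_object in seen:
            problems.append(f"duplicate mapping for {mapping.source_object}")
        seen.add(mapping.source_object)
    for name in source_objects:
        if name not in seen:
            problems.append(f"no mapping for {name}")
    return problems
```

With 148 source objects in scope, a check like this would flag a table or view that was analyzed but never mapped to a stage and target table.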
ECU ETL SSIS ARCHITECTURE for External Data Sources
[Diagram: SOURCE (flat files, RDBMS, web services) → Integration Services (SQL Agent Job, SSIS Package Store, SSIS Logging) → DESTINATION (ODS), with ETL Job Execution Reporting]
1. SSIS Import Export Wizard (creates destination tables in ODS)
2. Create Object Mapping Table
3. Generate SSIS ETL Package encoded with design/ETL standards
4. Deploy Package to SSIS ETL Server
SQL Server Import Export Wizard: Creating Destination Tables in the ECUBIC SSIS Staging Schema (ODS)
Creation of the Object Mapping Table: RAM_OBJECT_MAPPING
ECU SSIS Package Automation Tool: Building the RAMSES ETL Package
SSIS ETL Package Execution Reporting