Page 1: Education Data Warehousing - QC TECHNOLOGY DECISIONS INC. › uploads › 5 › 8 › 9 › 7 › 58970915 › ... · 2020-03-21 · Accelerating Education Data Warehousing with Informatica

Education Data Warehousing Analyzing trends and uncovering value in education data

White Paper

This document contains Confidential, Proprietary and Trade Secret Information (“Confidential information”) of QC Technology Decisions Inc. and may not be copied, distributed, duplicated, or otherwise reproduced in any manner without the prior written consent of QC Technology Decisions Inc.

While every attempt has been made to ensure that the information in this document is accurate and complete, QC Technology Decisions Inc. does not accept responsibility for any kind of loss resulting from the use of information contained in this document. The information contained in this document is subject to change without notice.

Published August 2013

Copyright © 2013 QC Technology Decisions Inc. All rights reserved.

Accelerating Education Data Warehousing with Informatica Data Quality

Overview

State education agencies face many challenges in sustaining the benefits now being realized from significant federal investments in the US education system. As of March 2013, the American Recovery and Reinvestment Act (ARRA) has awarded $6.5 billion [i] for Race to the Top, a competitive grant program designed to encourage and reward states that are creating the conditions for education innovation and reform, achieving significant improvement in student outcomes, and ensuring student preparation for success in college and careers. Additionally, the Statewide Longitudinal Data Systems (SLDS) Grant Program has invested $610 million [ii] (of which ARRA contributed $360 million [iii]) since 2005 through five rounds of funding to 47 states, the District of Columbia, Puerto Rico, and the Virgin Islands.

Federal government programs have clearly made a significant investment. Even so, due to a variety of factors, at least 34 states and the District of Columbia have implemented cuts to K-12 education, and over 43 states have implemented cuts to public colleges and universities and/or made large increases in college tuition to make up for insufficient state funding. [iv] In light of this budget-cutting trend, it is important that states act now and focus on developing a sustainable model for their longitudinal data systems.

The SLDS program provides grants to design, develop, and implement statewide P-20 longitudinal data systems to capture, analyze, and use student data from preschool to high school, college, and the workforce. Although many states have begun to see value from their investments in these data systems, the industry believes that there is significantly more value in such data when it is linked with other “Big Data.”

Optimizing the value derived from these systems requires some skills that most IT departments lack today. Historically, analytics have been built without looking at very large and complex data sets. Stakeholders were tasked only with analyzing subsets of structured data, and never considered the other (sometimes unstructured) data within the organization. This meant that only a limited number of questions could be asked. As the data grows, so do the questions that stakeholders will want to ask.

The Data Quality Campaign is a strong advocate for empowering stakeholders with education data. It has defined ten actions that, once implemented, will:

• Increase efficiency: Ensure that our education systems produce the greatest return on investment possible
• Improve system performance: Inform system-wide management and evaluation decisions
• Increase transparency: Shine a light onto the education system to see what works and what does not
• Improve student achievement: Inform all education stakeholders’ decisions with quality data to help them make the best decisions to improve every student’s achievement

Figure 1 Total Investment in SLDS Programs (Millions)

“When schools fail, our children and our neighborhoods suffer.” U.S. Secretary of Education Arne Duncan, March 11, 2013

Figure 2 Map of Actions Implemented

Although there has been great progress as a result of the significant investment in SLDS programs, few states have implemented all ten of the actions required to meet their goals, which further demonstrates the need for additional investment in the SLDS program.

QC Technology Decisions recognizes that most states are actively working to define and implement these (or similar) actions under numerous budget, resource, and time constraints. QC has the skills, partners, and solutions required to fill any gaps that states may have in reaching their goals or sustaining their SLDS solutions.

Education Data Warehousing Solutions

Data warehouses have been around for a long time and many organizations are great at creating them. The data warehouse, admittedly, is a critical component of the SLDS solution; however, a number of other components are required to work together to deliver a functional data warehousing solution that meets the stakeholders’ goals.

Just as each project is unique, the solution for getting answers to questions using data can be unique as well. As a result, there is a strong desire to conform existing systems, merge data from new systems, and link data to other systems in order to make sense of the readily available information. As states evolve and mature their data warehousing practices, more and more data is conformed using a variety of methods to meet their business objectives. There is a “tipping point,” however, where data becomes unmanageable; for many, this point was crossed the moment they began to link their K-12 education data with other state systems such as workforce, higher education, and other data systems.

Figure 2 Education Data Warehousing Solution

“Insystech has provided excellent IT consulting services to my team at Fairfax County Public Schools. The IT consultants provided to us were high quality professionals who had the experience and skills that we needed. FCPS has enjoyed a long term relationship with Insystech. The company has provided quality IT services to our school district since 1997.”

Ken Rice, Coordinator, Instructional Systems, Fairfax County Public Schools, VA

A modern approach to Education Data Warehousing makes managing data possible through:

• Collecting and matching data stored in the current systems while providing capability to support data exchange to and from other sources as the system evolves
• Organizing a comprehensive longitudinal data repository based on Data Quality Campaign (DQC) standards consisting of several integrated data marts that store information for all aspects of the school system, including student enrollment, student test scores, student grades, transcripts, program participation, teacher education/professional development, and school infrastructure
• Presenting student trends, performance, progress, and program effectiveness longitudinally through a common and easy-to-use web portal using ad-hoc and interactive reports, charts, and dashboards
• Controlling access to information and system capabilities for various groups including the general public, parents, teachers, principals, school administrators, analysts, and system administrators/developers

Data Collection

When developing the complex data collection and matching processes, data structures, and reports that make up the data warehouse solution, QC follows a Kimball approach (considered industry best practice) to data warehousing. The solution must be designed to integrate data from multiple source systems and multiple agencies. Data systems that would act as sources for the data warehouse are analyzed to help develop both the business and technical teams’ understanding of the availability and quality of data in the source systems. This information is also used to complete a functional gap analysis identifying potential gaps in the availability of data for the reports and dashboards planned for the system.
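The functional gap analysis described above can be illustrated with a small sketch. It is only an illustration under stated assumptions: the source system names, report names, and data element names below are hypothetical and do not come from any actual SLDS inventory. The idea is simply to compare the data elements each planned report needs against the elements the profiled source systems can supply.

```python
from typing import Dict, Set


def find_data_gaps(report_requirements: Dict[str, Set[str]],
                   source_elements: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Return, per planned report, the required elements no source system provides."""
    # Pool every element offered by any source system.
    available: Set[str] = set().union(*source_elements.values())
    gaps: Dict[str, Set[str]] = {}
    for report, required in report_requirements.items():
        missing = required - available
        if missing:
            gaps[report] = missing
    return gaps


# Hypothetical inventory produced by profiling the source systems.
SOURCE_ELEMENTS = {
    "k12_sis": {"student_id", "enrollment_date", "grade_level", "test_score"},
    "higher_ed": {"student_id", "degree_program", "credits_earned"},
}

# Hypothetical data needs of the planned reports and dashboards.
REPORT_REQUIREMENTS = {
    "college_readiness": {"student_id", "test_score", "degree_program"},
    "workforce_feedback": {"student_id", "degree_program", "quarterly_earnings"},
}

GAPS = find_data_gaps(REPORT_REQUIREMENTS, SOURCE_ELEMENTS)
print(GAPS)
```

In this made-up inventory, only the workforce report has a gap, because no profiled source supplies earnings data. A real analysis would also have to consider data quality and coverage, not just whether an element exists.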

Data Matching

A component essential to the sustainability of the solution is a controlled vocabulary, with business rules established to interpret data in the data warehouse. It is natural for each source system to have its own set of terms to describe the data elements it provides to the SLDS; however, a number of attributes, such as demographic attributes, must be conformed to establish proper links between the source data and to support longitudinal analysis.
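As a rough illustration of conforming an attribute to a controlled vocabulary, the sketch below maps source-specific codes to one agreed set of terms. The attribute name, the source codes, and the conformed values are all assumptions made for the example; an actual vocabulary would be defined through the governance process.

```python
from typing import Dict

# Hypothetical controlled vocabulary: for each attribute, the codes seen in
# source systems mapped to the single conformed term used in the warehouse.
CONTROLLED_VOCABULARY: Dict[str, Dict[str, str]] = {
    "gender": {
        "m": "Male", "male": "Male", "1": "Male",
        "f": "Female", "female": "Female", "2": "Female",
    },
}


def conform(attribute: str, raw_value: object, unknown: str = "Unknown") -> str:
    """Translate a source value to its conformed term, or flag it as unknown."""
    mapping = CONTROLLED_VOCABULARY.get(attribute, {})
    key = str(raw_value).strip().lower()  # tolerate case and stray whitespace
    return mapping.get(key, unknown)


print(conform("gender", " M "))  # code from one hypothetical source system
print(conform("gender", 2))      # numeric code from another
print(conform("gender", "x"))    # unmapped value is flagged, not guessed
```

Returning an explicit "Unknown" rather than passing the raw value through keeps unmapped codes visible, so they can be reviewed and added to the vocabulary.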

Data transformation (ETL) processes are developed to load data from the source systems in a form that meets the requirements of the data warehouse. Records are conformed and linked using record matching technology, which overcomes issues such as misspelling, transposition of data elements, internationalization of names, as well as numerous other data quality issues.
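The sketch below gives a feel for how record matching can tolerate the issues listed above. It is not the matching technology used in the solution, which this paper does not detail; it is a simplified stand-in built on Python's standard library. The record fields, the birth-date blocking rule, and the 0.85 threshold are assumptions chosen for illustration.

```python
import unicodedata
from difflib import SequenceMatcher
from typing import Dict


def normalize_name(name: str) -> str:
    """Lower-case, collapse whitespace, and strip accents (e.g. 'José' -> 'jose')."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(stripped.lower().split())


def name_similarity(a_first: str, a_last: str, b_first: str, b_last: str) -> float:
    """Similarity in [0, 1]; name tokens are sorted so swapped fields still align."""
    a = " ".join(sorted(normalize_name(f"{a_first} {a_last}").split()))
    b = " ".join(sorted(normalize_name(f"{b_first} {b_last}").split()))
    return SequenceMatcher(None, a, b).ratio()


def is_probable_match(rec_a: Dict[str, str], rec_b: Dict[str, str],
                      threshold: float = 0.85) -> bool:
    """Assumed rule: birth dates must agree exactly and names must be similar."""
    if rec_a["birth_date"] != rec_b["birth_date"]:
        return False
    score = name_similarity(rec_a["first"], rec_a["last"],
                            rec_b["first"], rec_b["last"])
    return score >= threshold


# Hypothetical records for the same person from two source systems.
k12 = {"first": "José", "last": "García", "birth_date": "1995-04-12"}
college = {"first": "Garcia", "last": "Jose", "birth_date": "1995-04-12"}  # swapped fields
print(is_probable_match(k12, college))
```

Production matching engines typically add phonetic encodings, weighted comparison of many fields, and a manual review queue for borderline scores; this sketch only shows the basic idea of normalizing and then comparing approximately.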

Distribution reports can help decision makers quickly identify best and worst performers (example: What K-12 courses, high school career clusters, programs and other experiences are preparing students well for STEM postsecondary degrees?)

Data Organization

Data is loaded into a comprehensive, multi-dimensional, statewide longitudinal education data warehouse. ETL processes keep the data updated and organized so that it may be managed consistently over time and presented to end-users as reports.

For each facet of information managed by the system, additional characteristics (many of which are shared) that describe these facets are linked to facilitate data management, reporting, and analysis. These dimensions help describe what kinds of events and characteristics are measured over an individual’s (and, by extension, a group’s) learning lifetime.

As educators use this information to make decisions that improve student performance and/or education policies, an improvement cycle is created; thus, users of the system can see trends in key performance indicators over time. This information helps all stakeholders develop a better understanding of their students, which can help institutions better prepare students for the 21st century workforce.

Figure 3 Preparing the 21st Century Workforce

Data Presentation

End users will usually access the system via a single web-based business intelligence portal, through which unstructured information such as documents, web pages, and blogs may be shared alongside interactive dashboards and reports.

Since the SLDS program serves as an authoritative data repository for all education programs, it creates an opportunity for a variety of different stakeholders to access a common set of data at different levels of aggregation and scope:

• Student data, such as performance data for each student in his/her class, might be made available to teachers. They can aggregate that data to analyze trends, determine what content needs to be reinforced, and decide how to alter teaching methods to ensure students acquire the content knowledge and skills in the state’s college and career-ready or internationally benchmarked standards.
• School data, such as teacher assignment, evaluation, and professional development; student course assignments and targeted supports; and interim and summative testing, might be available to school administrators to help guide staff time and resources. Of course, administrators will also have access to data that helps identify which students from which classrooms are off-track to scoring proficient on end-of-grade or end-of-course exams in key subjects, and why.
• School district administrators may have access to data to help improve curriculum and practices both for their institution(s) and in under-performing schools, to allocate teacher and staff resources, and to provide professional development opportunities.
• Policy makers may have access to valuable data on a robust set of key performance indicators that measure and report school and district progress towards college and career readiness goals, including measures of progress such as Adequate Yearly Progress. They may be able to identify schools and districts in need of targeted supports and interventions and those deserving of recognition for outstanding achievements. Policy makers can also use the data to analyze trends across schools and districts, helping them evaluate policies.
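One way to picture stakeholders sharing a common set of data at different levels of aggregation and scope is sketched below. The roles, the filtering and grouping rules, and the sample rows are hypothetical; in a real deployment, the access rules would be set by the governance model and enforced by the business intelligence platform rather than by application code like this.

```python
from statistics import mean
from typing import Dict, List, Optional, Tuple

# Hypothetical student-level rows held once in the common repository.
ROWS: List[Dict[str, object]] = [
    {"district": "D1", "school": "S1", "teacher": "T1", "student": "A", "score": 70.0},
    {"district": "D1", "school": "S1", "teacher": "T1", "student": "B", "score": 90.0},
    {"district": "D1", "school": "S2", "teacher": "T2", "student": "C", "score": 60.0},
    {"district": "D2", "school": "S3", "teacher": "T3", "student": "D", "score": 80.0},
]

# Assumed policy: role -> (field limiting the user's scope, field to aggregate by).
ROLE_POLICY: Dict[str, Tuple[Optional[str], str]] = {
    "teacher": ("teacher", "student"),        # own class, one row per student
    "school_admin": ("school", "teacher"),    # own school, averaged per classroom
    "district_admin": ("district", "school"), # own district, averaged per school
    "policy_maker": (None, "district"),       # statewide, averaged per district
}


def view_for(role: str, scope_value: Optional[str]) -> Dict[str, float]:
    """Return average scores at the aggregation level the role is allowed to see."""
    filter_field, group_field = ROLE_POLICY[role]
    visible = [r for r in ROWS
               if filter_field is None or r[filter_field] == scope_value]
    groups: Dict[str, List[float]] = {}
    for row in visible:
        groups.setdefault(str(row[group_field]), []).append(float(row["score"]))
    return {key: mean(scores) for key, scores in groups.items()}


print(view_for("teacher", "T1"))         # student-level detail for one class
print(view_for("district_admin", "D1"))  # school-level averages for one district
print(view_for("policy_maker", None))    # district-level averages statewide
```

The same rows answer every request; only the scope filter and the grouping level change with the role.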

Report Annual Trends in KPIs alongside other measures to help discover/predict why and when changes occur (example: What K-12 courses, programs and other experiences are predictive of college readiness?)

Interactive dashboards to display frequently requested information on KPIs at a glance (example: How much student loan debt do students leave with compared to their employment rates and earnings?)

Comparing multiple student cohorts over time can help answer specific questions to help support changes to policies (example: What is the profile of high school students (across time) entering and not entering some type of postsecondary education immediately following high school graduation?)

Education Data Warehousing in Practice

In 2012, QC Technology Decisions Inc., as a sub-contractor to Insystech Inc., completed the design, development, and implementation of a P-20 Statewide Longitudinal Education Data (SLED) System for the West Virginia Higher Education Policy Commission. The solution included a data warehouse, dashboards, and governance that now provide answers to questions from the state legislature, higher education research staff, K-12 staff (including administrators and teachers), and workforce research staff.

Governance

Governance describes the overriding authority and legislative requirements for all architectural, design, development, and business decisions, including all policies and procedures created on behalf of the agencies responsible for the system. The West Virginia P-20 SLED implementation involved the effective interaction of multiple groups, individuals, and disciplines. To provide a unified, centrally governed approach throughout the lifecycle of the solution, a governance model was implemented in conjunction with the technical solution.

Data Model

At a very high level, the solution created for West Virginia structured the data warehouse around the processes by which education programs are managed and delivered to students. The data model was developed based on data warehousing best practices, input from industry experts, and lessons learned. The P-20 model is the evolution of an award-winning K-12 data warehouse created in partnership with Fairfax County Public Schools called EDSL which, since 2000, continues to be supported and enhanced by Insystech, serving over 170,000 students and 25,000 educators. The current model stores outcomes and other information at point-in-time events such as graduation, taking a test, enrolling in a school, getting approval for a student loan, completing a grade, etc. Each event in the student’s academic history may have one or more outcomes, which are recorded as “facts” that are aggregated at the level at which they are collected (typically the student level). This method of classifying events in the data model enables policy makers to develop a framework for exploring the data warehouse in terms that are easy to understand from a business perspective.
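To make the event-and-fact idea concrete, the sketch below records outcomes as student-level facts tied to a point-in-time event and then rolls them up by school year. The field names, event types, and sample values are illustrative assumptions; they are not taken from the West Virginia data model.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Fact:
    """One outcome of one point-in-time event, recorded at the student level."""
    student_id: str
    event_type: str   # e.g. "test_taken", "graduation" (hypothetical names)
    school_year: int
    measure: str      # e.g. "score"
    value: float


def average_by_year(facts: Iterable[Fact], event_type: str,
                    measure: str) -> Dict[int, float]:
    """Aggregate one measure of one event type up from students to school years."""
    totals: Dict[int, List[float]] = defaultdict(lambda: [0.0, 0.0])
    for fact in facts:
        if fact.event_type == event_type and fact.measure == measure:
            totals[fact.school_year][0] += fact.value  # running sum
            totals[fact.school_year][1] += 1           # running count
    return {year: s / n for year, (s, n) in sorted(totals.items())}


# Hypothetical facts from two students' academic histories.
FACTS = [
    Fact("S1", "test_taken", 2011, "score", 80.0),
    Fact("S2", "test_taken", 2011, "score", 90.0),
    Fact("S1", "test_taken", 2012, "score", 88.0),
    Fact("S1", "graduation", 2012, "diploma_earned", 1.0),
]

TREND = average_by_year(FACTS, "test_taken", "score")
print(TREND)
```

Because every fact names its event type, a question such as "how did test scores trend by year?" can be phrased in business terms and answered with one filter and one roll-up.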

Functionality

• Ability to easily connect to any contributing systems to extract, transform and load data into the data warehouse
• Data exchange capabilities to provide for the easy submission of data from districts and other partners
• Unique record matching technology with a “master person index,” which overcomes issues such as misspelling, transposition of data elements, internationalization of names, as well as numerous other data quality issues
• Comprehensive, multi-dimensional statewide longitudinal education data warehouse organizing student records from pre-school through age 20 years, including enrollment, assessments, earnings, financial aid, debt, and much more
• Subject-specific data marts for cohort comparisons and analysis of various types of high school graduates and degree seekers in areas such as high school feedback, college performance, college indebtedness and income earnings, remedial development, workforce feedback, STEM pipeline, dual enrollment, teacher education, and proficiency testing predictors
• Public and private web portals with interactive dashboards and scorecards

Technical Environment

The following off-the-shelf products were used to implement the P-20 SLDS solution for West Virginia:

• Network/Operating System: Windows 2008 R2, Secure Public Access (SSL 256-bit), Internet Information Server 7.0, N-Tier Server Architecture, and VMware Virtualization
• Database: Oracle 11g R2 Enterprise Edition with Partitioning Option
• ETL: IBM Cognos Data Manager 10.1, IBM Cognos Adaptive Warehouse 10.1, and Oracle PL/SQL stored procedures
• Portal/Reporting: IBM Cognos 10.1, SharePoint 2010

“Insystech team worked well with our research staff to define standard taxonomy for the data, gather and process data, design and develop dashboards and reports. Insystech team of experts also assisted in the setup of governance procedures to help sustain the system.”

Rob Anderson, Executive Vice Chancellor for Administration, West Virginia Higher Education Policy Commission

Summary

QC Technology Decisions Inc. has contributed to the evolution of Education Data Warehousing through the design and development of statewide data warehouse solutions such as the West Virginia P-20 SLED (Statewide Longitudinal Education Data) system and local education data warehouse solutions for school districts in the United States. Our team has practical experience integrating data from student information systems, state extracts, and testing solutions using leading integration platforms such as Informatica and SQL Server Integration Services. We can architect and implement solutions using leading business intelligence platforms like Oracle OBIEE, IBM Cognos, Tableau and Microsoft SharePoint. Our data warehousing professional services include:

• Enterprise Architecture
• Stakeholder outreach and change management
• Requirements analysis
• Data integration strategy, planning and implementation
• Data quality improvement strategy, planning and execution
• Enterprise Data Warehouse design, development and maintenance
• Extract, transform and load (ETL) process design and development
• Custom web application development
• Systems Integration
• Web Services development
• Portal development
• Training and documentation

QC Technology Decisions Inc.
26 Crestline Drive
Fredericton, NB E3G 6B1
(506) 470-8938

[email protected]

www.qctechnology.com

© 2013 QC Technology Decisions Inc. All rights reserved. Printed in Canada. QC Technology Decisions Inc. and the QC logo are trademarks or registered trademarks of QC Technology Decisions Inc. in Canada and in jurisdictions throughout the world. All other company and product names may be trade names or trademarks of their respective owners.

[i] Recovery.gov, http://www.recovery.gov/espsearch/Pages/advanced.aspx?data=recipientAwardsList&AwardType=CGL&CFDA_CODE=257 (March 20, 2013)
[ii] Institute of Education Sciences, National Center for Education Statistics, http://nces.ed.gov/programs/slds/stateinfo.asp (March 20, 2013)
[iii] Recovery.gov, http://www.recovery.gov/espsearch/Pages/advanced.aspx?data=recipientAwardsList&Agency=91&AwardType=CG&CFDA_CODE=256 (March 20, 2013)
[iv] An Update on State Budget Cuts, http://www.cbpp.org/cms/index.cfm?fa=view&id=1214 (February 9, 2011)

About QC Technology Decisions

QC Technology Decisions Inc. is best known as a K-12 educational technology firm specializing in business intelligence and enterprise information management. Since 2005, our team has been providing professional IT services in both the United States and Canada to Local and State Education Agencies. Our team has extensive experience and knowledge in the development of educational policy and the evaluation of educational programs, as well as in the design, deployment and management of information technology solutions, including data warehouse systems.