Shawn McClure Software Engineer CIRA, Colorado State University [email protected]...
-
date post
21-Dec-2015 -
Category
Documents
-
view
219 -
download
0
Transcript of Shawn McClure Software Engineer CIRA, Colorado State University [email protected]...
Shawn McClureSoftware EngineerCIRA, Colorado State [email protected]
Projects:
Visibility Information Exchange Web System (VIEWS)Interagency Monitoring of Protected Visual Environments (IMPROVE)Air Toxics Data Archive (ATDA)WRAP Technical Support System (TSS)
Background:
Systems AnalystDatabase ArchitectProgrammer
“VIEWS-based” web sites...
VIEWS
IMPROVE
ATDA
VIEWS Data Presentation and Analysis
Some possible goals:
Understand the current systemIdentify issues, goals, and requirementsDesign a new systemImplement the new systemMaintain and evolve the system
To help ask a question:
Can we improve upon the data management system* used by the Air Quality Group at CNL?
* System: consists of the collection of manual and automatic processes by which information is collected, managed, and disseminated, and by which work is done.
Why am I here?
Data Redundancy and ConfusionResults when different people independently collect or copy the same piece of information
Can lead to conflicting naming and coding conventions
May result in confusion about which version of the data is most recent or most correct
Program – Data DependenceTight relationship between data and the programs required to update and maintain that data
A change in data requires a change in all the program that work with the data
Changes in data type, field length, etc. cause change in programs
Some common problems with data management…
Lack of FlexibilityDifficulty delivering ad hoc reports and/or responding to unanticipated information
requirements in a timely fashion
Handling ad hoc requests: Data is in the system, but is very expensive (in time and effort) to
assemble and organize
Poor SecurityThere is no mechanism for knowing who is accessing the data and how they're modifying it
Access to the data can be unsystematic and uncontrolled
Some common problems with data management: (cont’d)
Lack of Data Sharing and AvailabilityFinding data can be difficult
Retrieving data can be difficult
Because pieces of information in different files and different parts of the organization cannot
be related to one another, it is virtually impossible for information to be shared or accessed
in a timely manner
Information cannot flow freely across different functional areas or different parts of the
organization
Some common problems with data management: (cont’d)
Functions of a Data Management System1. Getting data into the system2. Working with data while it is in the system3. Getting data out of the system
Data Management System
1
2
Getting Data In
Getting Data Out
Working With Data
3
Source Data
Import
Source Data
Source Data
Validation
DatabaseRules
ProgramLogic
Storage Retrieval Presentation
Analysis Interpretation
Transformation
Back End Front End
Import: Getting data into the system
Validation: Ensuring data accuracy
Storage: Managing data, backup, and archival
Transformation: Sorting, joining, aggregating
Retrieval: Getting the data out
Presentation: Displaying the data
Analysis: Making the data understandable
Interpretation: Making the data usable
A data management system in more detail…
· Fully normalized, third normal form· Best for data import, validation, tracking, and
management· More difficult for end user interaction
Transactional Database Data Warehouse
Data WarehouseTransactional Database
· De-normalized “star” schema· Best for end user interaction, querying, and
front-end applications · Harder to update and automatically maintain
data integrity
Two Possible Approaches: Transactional Database and a Data Warehouse
A Hybrid Approach: Two Interrelated Relational Database Systems
OLTP:OLTP:
• Functions as the “back-end” database• Fully relational and in 3rd normal form• Used for data import, validation, and
management• Technologies: Microsoft SQL Server
Data Warehouse Generation System:Data Warehouse Generation System:
• Extracts data from the OLTP• De-normalizes and transforms data• Loads data into the Data Warehouse• Builds table indexes• Archives “snapshots” of the database• Technologies: VB, stored procedures
Data Warehouse:Data Warehouse:
• Functions as the “front-end” database• Uses a de-normalized “star schema”• Used for querying and archiving data• Automatically generated from the OLTP• Technologies: Microsoft SQL Server
12
Determine our goals: What are we trying to achieve?
Design a strategy: How are we going to achieve it?
Identify problem areas: What do we have to watch out for?
Prioritize tasks: What is most important for success?
Adjust scope: What set of goals is most realistic?
Recommend alternatives: How else could we do things?
Allocate resources: How do we support our efforts?
Realign expectations: How do we communicate any adjustments?
Define milestones: How do we know when we're done?
General Tasks and Associated Questions
What components/aspects of the current CNL-AQG data management system(s) are we
interested in examining?
How do we determine our priorities? (i.e. What should come first, next, and later?)
How do we make any necessary “transitions” with minimal impact to current operations?
What new “learning curves” will be involved?
How will we need to “shift our paradigms”?
How do we maintain a new system?
How do we know when we’ve achieved what we want?
Some Issues and Concerns…