The RIGHT DATA at the RIGHT TIME in the RIGHT PLACE Data Stream Processor Introduction.


Page 1

The RIGHT DATA at the RIGHT TIME in the RIGHT PLACE

Data Stream Processor Introduction

CONFIDENTIAL

Page 2

Contents

1. What Is DSP

2. DSP Transfer

3. DSP Parser

4. DSP Loader

5. DSP Generator

Page 3

What is DSP

DSP stands for Data Stream Processor. It is an enterprise system for creating and managing enterprise data streams and for processing batch files.

DSP can transport files in and out of the company network, parse them, and either load the parsed data into different database systems or generate new data files for downstream clients.

DSP is designed to be run by an enterprise scheduler, but any of its main processes can also be run manually or in daemon mode (as a background process with a configurable sleep time between runs).

DSP can process thousands of files and gigabytes of transactions a day; it can run on any of the popular enterprise platforms such as Linux, Solaris, AIX and Windows.

DSP is more than just a batch file processor. It enables users to create complex internal data streams and perform near-real-time data updates on downstream client systems.
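The daemon mode mentioned on this slide (a background process with a configurable sleep time between runs) can be pictured with a minimal Python sketch. The function name `run_daemon`, its parameters and the `max_runs` cap are illustrative assumptions, not part of DSP.

```python
import time
from typing import Callable, Optional


def run_daemon(process: Callable[[], None],
               sleep_seconds: float,
               max_runs: Optional[int] = None,
               sleep: Callable[[float], None] = time.sleep) -> int:
    """Run `process` repeatedly, sleeping between runs.

    `max_runs` is a hypothetical cap so that the sketch can terminate;
    a real daemon would loop until it is told to stop.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        process()
        runs += 1
        # Sleep only if another run will follow.
        if max_runs is None or runs < max_runs:
            sleep(sleep_seconds)
    return runs
```

Passing the sleep function as a parameter keeps the loop easy to exercise without actually waiting.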

Page 4

Business Advantages Of DSP

Reduce development costs and speed up time-to-production for new data

Let’s say the IT department gets a request from the business to load new data from a vendor. Assume it takes a developer a minimum of one week to write a program that downloads the file, parses it and loads it into a database, including all the related tasks such as testing, move-to-production planning and implementation. At 8 hours per day and an average cost of $100 per hour, the total cost of this little project is $4,000.

While $4,000 may not seem like a big number to a large IT department, such an organization usually gets dozens of these requests a year, so the potential savings balloon to tens of thousands of dollars a year: $48,000 for the first dozen requests alone.

With DSP, all the developer has to do is set up several parameters in a few tables or a configuration file and test the result. This shouldn’t take an average developer more than a couple of hours. In fact, a developer with some experience could probably do the whole setup on a coffee break.
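The arithmetic behind these figures can be checked with a few lines of Python. The 5-day working week and the 2 hours used for "a couple of hours" are assumptions made for this sketch.

```python
HOURS_PER_DAY = 8
HOURLY_RATE = 100        # dollars per hour, from the example above
DAYS_PER_REQUEST = 5     # one working week, assumed to be 5 days
REQUESTS_PER_YEAR = 12   # "the first dozen requests"
DSP_SETUP_HOURS = 2      # assumption for "a couple of hours"

custom_cost = DAYS_PER_REQUEST * HOURS_PER_DAY * HOURLY_RATE
dsp_cost = DSP_SETUP_HOURS * HOURLY_RATE

print(custom_cost)                                    # 4000
print(custom_cost * REQUESTS_PER_YEAR)                # 48000
print((custom_cost - dsp_cost) * REQUESTS_PER_YEAR)   # 45600
```

Under these assumptions the hand-written approach costs $48,000 for a dozen requests, and the net saving after the DSP setup time is $45,600.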

Page 5

Business Advantages of DSP

Besides cost reduction, the other critical benefit is turnaround speed for the business. With DSP, business users can usually have their data by the end of the day.

Improve Data Quality

When you offload the tedious and mechanical work to DSP, your developers can concentrate on the data itself. They can spend most of their time analyzing the data and creating custom programs for data validation. This will improve data quality, which should have a direct positive impact on the company’s bottom line.

Simpler maintenance and support

Large data processing plants often have hundreds of scripts spread out all over the company network. This makes it difficult even to keep track of them, to say nothing of maintenance and support.

With a DSP implementation, many of these scripts can be decommissioned, making maintenance easier. Additionally, much of the support is off-loaded to the vendor.

Lower development risk

Since IT doesn’t need to do much of the development that DSP can perform, the department automatically reduces its development risks, which are often significant.

Page 6

DSP is Simple, Powerful, Flexible

The system was designed with three main goals: Simplicity, Flexibility, Power.

• Easy installation

• Flexible processing options

• Parallel Processing

• Clear system logging with easy error detection

• No dependencies

• Extendibility

Page 7

General System Attributes

The system is easy to administer: easy to install, configure, operate and maintain. All the functionality is contained in a few binary files. There are no application servers to install and maintain and no multitude of jar files to keep track of.

There are multiple ways to configure the system and each run: in the database, in the configuration file(s), on the command line, or any combination of these. Yet most of the options don’t need to be configured at all because they default to the most logical setup. This can significantly speed up the configuration of a new process and make it less error-prone.

All the major components can be run in parallel processing mode, allowing better performance and higher data throughput.

The system is customizable. It allows new internal variables to be defined and offers internal hooks for external custom code to be run within the system.

Page 8

Batch Processing Tasks

DSP supports the four most common tasks in any data processing:

Get: File Transfer is the movement of files in and out of the company network as well as within it.

Parse: File Parsing is extracting, validating and formatting data from the files to get it ready for loading.

Load: File Loading is loading the parsed data into various databases.

Make: File Generating is creating data files from database data or other files.

Page 9

DSP File Transfer

DSP Transfer moves files between vendors, clients and users, in and out of the enterprise network.

[Diagram: the Enterprise exchanges files with Customers and Vendors over FTP/SFTP and HTTP/HTTPS.]

Page 10

File Transfer Processing Tasks

DSP doesn’t just transfer files; it can also encrypt, compress and archive them as required within the same transfer process.

[Diagram: Task Sequences within File Transfer]

Incoming File: Download, Archive, Decrypt, Decompress

Outgoing File: Compress, Encrypt, Archive, Upload
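One way to picture a task sequence inside a single transfer is a list of steps applied in order to the file content. The Python sketch below is an illustration only: the step order, the helper names `make_archiver` and `run_sequence`, and the in-memory archive are assumptions, and the encryption steps are left out because Python's standard library has no AES or Blowfish implementation.

```python
import gzip
from typing import Callable, Dict, List

Task = Callable[[bytes], bytes]


def make_archiver(archive: Dict[str, bytes], name: str) -> Task:
    """Return a task that stores a copy of the data and passes it on unchanged."""
    def archive_task(data: bytes) -> bytes:
        archive[name] = data
        return data
    return archive_task


def run_sequence(data: bytes, tasks: List[Task]) -> bytes:
    """Apply each task in order to the file content."""
    for task in tasks:
        data = task(data)
    return data


archive: Dict[str, bytes] = {}

# Outgoing file (assumed order): compress, then archive what is about to be sent.
outgoing = [gzip.compress, make_archiver(archive, "out/report.csv.gz")]
# Incoming file (assumed order): archive the file as received, then decompress it.
incoming = [make_archiver(archive, "in/report.csv.gz"), gzip.decompress]

sent = run_sequence(b"id,amount\n1,9.50\n", outgoing)
received = run_sequence(sent, incoming)
```

Keeping each step as an independent function makes it easy to switch steps on or off per file, which is the flexibility the slide describes.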

Page 11

Processing Control

DSP Transfer allows process monitoring and flexible system date card flipping.

• Set up a “Runtime Limit” for processing

• Check if all files are processed after the runtime limit

• Update the system date to the next processing date

• Set processing to SUCCESS
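The processing control steps can be sketched as a small decision function. This is an interpretation of the slide, not DSP's actual logic: the function name, the status names other than SUCCESS, and the rule that the next processing date is the next calendar day are all assumptions.

```python
from datetime import date, datetime, timedelta
from typing import Set, Tuple


def check_processing(expected: Set[str],
                     processed: Set[str],
                     started: datetime,
                     now: datetime,
                     runtime_limit: timedelta,
                     system_date: date) -> Tuple[str, date]:
    """Return (status, system_date) for one processing day."""
    if expected <= processed:
        # Every expected file is processed: mark success and flip the date card.
        return "SUCCESS", system_date + timedelta(days=1)
    if now - started >= runtime_limit:
        # Limit reached with files still missing: keep the date unchanged.
        return "INCOMPLETE", system_date
    return "RUNNING", system_date
```

A real calendar would probably skip weekends and holidays when choosing the next processing date; the sketch ignores that.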

Page 12

Other Transfer Functionality

Ability to set up each file transfer separately or in bulk using wildcards.

Support for immediate local file archiving with its own compression option.

Support for different compression methods such as GZIP and ZIP.

Support for different data encryption schemes such as AES and Blowfish.

Support for common transfer protocols such as FTP, SFTP, HTTP and HTTPS.
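Setting up transfers one by one or in bulk with wildcards could look roughly like the following Python sketch. The rule fields, file names and the first-match-wins policy are hypothetical; the slide does not describe DSP's actual configuration layout.

```python
from fnmatch import fnmatch
from typing import Dict, List, Optional

# Hypothetical transfer rules: exact names first, wildcard patterns after.
RULES: List[Dict[str, str]] = [
    {"pattern": "prices_eod.csv", "protocol": "SFTP", "compression": "GZIP"},
    {"pattern": "trades_*.csv", "protocol": "SFTP", "compression": "ZIP"},
    {"pattern": "*.xml", "protocol": "HTTPS", "compression": "NONE"},
]


def find_rule(filename: str) -> Optional[Dict[str, str]]:
    """Return the first rule whose pattern matches the file name."""
    for rule in RULES:
        if fnmatch(filename, rule["pattern"]):
            return rule
    return None
```

For example, `find_rule("trades_20240105.csv")` picks up the bulk `trades_*.csv` rule, while a file with no matching rule returns `None`.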

Page 13

DSP File Parser

DSP Parser matches file data to specific database tables and creates table data files ready for loading.

[Diagram: Data File -> Parser -> Database Table Files]

Page 14

DSP Parser Functionality

• Field Mapping: Fields in the file can be automatically detected and mapped to tables

• Data Validation: Data can be validated and formatted at the source file level or outfile field level

• Internal Data Generation: Internal data can be defined and included in the output

• Header & Trailer Validation: Header & trailer can be validated based on custom rules
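The four parser functions can be illustrated with a small Python sketch. The comma-separated layout, the `FIELD_MAP` names, the `TRAILER,<count>` rule and the `batch_date` internal field are all assumptions made for the example, not DSP's actual formats.

```python
from typing import Dict, List

# Hypothetical mapping from file column names to table columns.
FIELD_MAP = {"Cust ID": "customer_id", "Amt": "amount"}


def parse_file(lines: List[str], batch_date: str) -> List[Dict[str, str]]:
    """Validate header/trailer, map fields and return rows ready for loading."""
    header, *body, trailer = lines
    columns = header.split(",")
    unknown = [c for c in columns if c not in FIELD_MAP]
    if unknown:
        raise ValueError(f"unmapped header fields: {unknown}")
    # Assumed trailer rule: "TRAILER,<record count>".
    tag, count = trailer.split(",")
    if tag != "TRAILER" or int(count) != len(body):
        raise ValueError("trailer record count does not match")

    rows = []
    for line in body:
        values = line.split(",")
        row = {FIELD_MAP[c]: v.strip() for c, v in zip(columns, values)}
        float(row["amount"])            # data validation: amount must be numeric
        row["batch_date"] = batch_date  # internal data added to the output
        rows.append(row)
    return rows
```

A file whose trailer count disagrees with the number of records is rejected with a `ValueError` before any row is produced.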

Page 15

DSP File Loader

DSP Loader loads table files into the designated database tables.

[Diagram: Table Files -> Loader -> Enterprise Databases]

Page 16

DSP Loader Functionality

• True Parallel Processing: True parallel processing is available, i.e. a separate process performs each table update, not just a thread running within the same process.

• Smart Parallel Processing: The main process keeps track of the tables being updated and prevents other processes from updating them at the same time, which would cause deadlocks.

• Different Types of Updates: Three update methods are available to best match the DB system, including smart updates in which only the changed records are updated.

• Primary Key Detection: The primary key can be automatically detected and used to update, insert and delete records.

• Ad-hoc SQL Processing: Custom SQL statements can be executed before and after a table update.
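The idea of a smart update, in which only changed records are touched and the primary key decides between update, insert and delete, can be sketched in Python as follows. This is one plausible reading of the slide; the function name and the in-memory row format are assumptions, and DSP's three update methods are not specified here.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]


def plan_smart_update(current: List[Row], incoming: List[Row], key: str
                      ) -> Tuple[List[Row], List[Row], List[object]]:
    """Compare table rows with incoming rows by primary key.

    Returns (inserts, updates, delete_keys); unchanged rows are skipped.
    """
    current_by_key = {row[key]: row for row in current}
    incoming_by_key = {row[key]: row for row in incoming}

    inserts = [r for k, r in incoming_by_key.items() if k not in current_by_key]
    updates = [r for k, r in incoming_by_key.items()
               if k in current_by_key and current_by_key[k] != r]
    deletes = [k for k in current_by_key if k not in incoming_by_key]
    return inserts, updates, deletes
```

Rows that are identical in the table and in the incoming file produce no statement at all, which is where the saving over a full reload comes from.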

Page 17

DSP File Generator

DSP Generator extracts data from a database or a file and generates a new data file.

[Diagram: Input File and Database -> File Generator -> Output Files]

Page 18

Generator Functionality

• Database or data file as source

• Conditional record selection

• Data validation & formatting at the file & field level

• Internal data definition and generation
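A generator run with a database source, conditional record selection and field-level formatting might look like this Python sketch. It uses an in-memory SQLite database purely for illustration (SQLite is not on DSP's list of supported databases), and the table, column names and output layout are hypothetical.

```python
import sqlite3
from typing import List


def generate_lines(conn: sqlite3.Connection, min_amount: float) -> List[str]:
    """Select qualifying records and format them as pipe-delimited lines."""
    cursor = conn.execute(
        "SELECT customer_id, amount FROM payments "
        "WHERE amount >= ? ORDER BY customer_id",
        (min_amount,),
    )
    # Field-level formatting: zero-padded id, two-decimal amount.
    lines = [f"{cid:06d}|{amount:.2f}" for cid, amount in cursor]
    # Internally generated data: a trailer carrying the record count (assumed layout).
    lines.append(f"TRAILER|{len(lines)}")
    return lines


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (customer_id INTEGER, amount REAL)")
conn.executemany("INSERT INTO payments VALUES (?, ?)",
                 [(17, 9.5), (18, 120.0), (19, 250.25)])
output = generate_lines(conn, min_amount=100)
```

Here `output` holds two data lines for the payments of at least 100, followed by the trailer line.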

Page 19

DSP Platform & DB Support

DSP supports most major platforms and database systems.

Platforms: Linux, Windows, Solaris, AIX

Databases: MSSQL, Oracle, Sybase, MySQL, PostgreSQL*, DB2*

* Planned in future releases

Page 20

Flexible Process Setup

DSP offers flexible processing setup through the use of 1) system database tables, which can be overridden by 2) config files, which in turn can be overridden on the 3) command line.
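The override order (database tables, then config files, then the command line) maps naturally onto a layered lookup. In the Python sketch below, the setting names, their values and the `--key=value` argument style are assumptions for illustration only.

```python
from collections import ChainMap
from typing import Dict, List


def parse_cli(args: List[str]) -> Dict[str, str]:
    """Turn hypothetical `--key=value` arguments into a dict."""
    pairs = (arg[2:].split("=", 1) for arg in args
             if arg.startswith("--") and "=" in arg)
    return {key: value for key, value in pairs}


# Hypothetical settings; the first mapping in a ChainMap wins.
database_settings = {"protocol": "FTP", "compression": "GZIP", "threads": "1"}
config_file_settings = {"protocol": "SFTP", "threads": "4"}
command_line_settings = parse_cli(["--threads=8"])

settings = ChainMap(command_line_settings, config_file_settings, database_settings)
```

With these values, `threads` comes from the command line, `protocol` from the config file and `compression` from the database tables.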

Page 21

Planned Functionality

Future releases will add other significant functionality to create a comprehensive back-end enterprise system.

XML Support: Parsing, loading and generating of XML-type data files

Messaging: A bridge for data movement between files, databases and messaging systems

Archiving: Moving, compressing and encrypting files locally or remotely across the enterprise network

Page 22

The RIGHT DATA at the RIGHT TIME in the RIGHT PLACE

CONFIDENTIAL