The RIGHT DATA at the RIGHT TIME in the RIGHT PLACE Data Stream Processor Introduction.
The RIGHT DATA at the RIGHT TIME in the RIGHT PLACE
Data Stream Processor Introduction
CONFIDENTIAL
Contents
1. What Is DSP
2. DSP Transfer
3. DSP Parser
4. DSP Loader
5. DSP Generator
What is DSP
DSP stands for Data Stream Processor. It is an enterprise system for creating and managing enterprise data streams and for processing batch files.
DSP can transport files in and out of the company network, parse them, and either load the parsed data into different database systems or generate new data files for downstream clients.
DSP is designed to be run by an enterprise scheduler, but any of its main processes can also be run manually or in daemon mode (as a background process with a configurable sleep time between runs).
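To make the daemon mode concrete, here is a minimal sketch of a background loop with a configurable sleep time between runs. It is an illustration only, not DSP's implementation: `run_daemon`, `process_once`, `max_runs` and the injectable `sleep` parameter are hypothetical names introduced for this example.

```python
import time
from typing import Callable, Optional


def run_daemon(process_once: Callable[[], None],
               sleep_seconds: float,
               max_runs: Optional[int] = None,
               sleep: Callable[[float], None] = time.sleep) -> int:
    """Call process_once repeatedly, sleeping between runs.

    max_runs is a hypothetical safety limit for demos and tests;
    None means "loop until the process is stopped".
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        process_once()  # one full pass, e.g. one transfer or load cycle
        runs += 1
        # Sleep only if another run will follow.
        if max_runs is None or runs < max_runs:
            sleep(sleep_seconds)
    return runs
```

A scheduler-driven run corresponds to calling `process_once` a single time; the daemon variant simply wraps the same work in a loop.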
DSP can process thousands of files and gigabytes of transactions a day; it can run on any of the popular enterprise platforms such as Linux, Solaris, AIX and Windows.
DSP is more than just a batch file processor. It enables users to create complex internal data streams and perform near-real-time data updates on downstream client systems.
Business Advantages Of DSP
Reduce development costs and speed up time-to-production for new data
Let’s say the IT department gets a request from the business to load new data from a vendor. Let’s assume it takes a developer a minimum of one week to write a program to download the file, parse it and load it into a database, including all the related tasks such as testing, move-to-production planning and implementation. At 5 working days of 8 hours each and an average cost of $100 per hour, the total cost of this little project is $4000.
While $4000 may not seem like a big number to a large IT department, there are usually dozens of these requests a year in an organization like that, so the potential savings balloon to tens of thousands of dollars a year: $48,000 just for the first dozen requests.
With DSP, all the developer has to do is set up several parameters in a few tables or a configuration file and test the result. This shouldn’t take an average developer more than a couple of hours. In fact, a developer with some experience could probably do the whole setup on his or her coffee break.
Business Advantages of DSP
Besides the cost reduction, the other critical benefit is turnaround speed for the business. With DSP, the business can usually have its data by the end of the day.
Improve data quality
When you offload the tedious and mechanical work to DSP, your developers can concentrate on the data itself. They can spend most of their time analyzing the data and creating custom programs for data validation. This will improve data quality, which should have a direct positive impact on the company’s bottom line.
Simpler maintenance and support
Large data processing plants often have hundreds of scripts spread out all over the company network. This makes it difficult even to keep track of them, to say nothing of maintenance and support.
With a DSP implementation, many of these scripts can be decommissioned, making maintenance easier. Additionally, much of the support is off-loaded to the vendor.
Lower development risk
Since IT doesn’t need to do the development work that DSP can perform, the department automatically reduces its development risks, which are often significant.
DSP is Simple, Powerful, Flexible
The system was designed with three main goals: Simplicity, Flexibility and Power
• Easy installation
• Flexible processing options
• Parallel processing
• Clear system logging with easy error detection
• No dependencies
• Extendibility
General System Attributes
The system is easy to administer; easy to install, configure, operate and maintain. All the functionality is contained in a few binary files. There are no application servers to install and maintain, no multitude of jar files to keep track of.
There are multiple ways to configure the system and each run: in the database, in the configuration file(s), on the command line, or any combination of these. Yet most of the options don’t need to be configured at all because they default to the most logical setup. This can significantly speed up the configuration of a new process and make it less error-prone.
All the major components can be run in parallel processing mode, allowing better performance and larger data throughput.
The system is customizable. It allows new internal variables to be defined and offers internal hooks for external custom code to be run within the system.
Batch Processing Tasks
DSP supports the four most common tasks in any data processing:
• Get File: Transfer is the movement of files in and out of the company network as well as within it
• Parse File: Parsing is extracting, validating and formatting data from the files to get it ready for loading
• Load File: Loading is loading of the parsed data into various databases
• Make File: Generating is creating data files from database data or other files
DSP File Transfer
DSP Transfer moves files between vendors, clients and users, in and out of the enterprise network
[Diagram: the Enterprise exchanges files with Customers and Vendors over FTP/SFTP and HTTP/HTTPS]
DSP doesn’t just transfer files; it can also encrypt, compress and archive them as required within the same transfer process
File Transfer Processing Tasks
Task Sequences within File Transfer
• Incoming file tasks: Download, Archive, Decrypt, Decompress
• Outgoing file tasks: Compress, Encrypt, Archive, Upload
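The idea of chaining tasks within one transfer can be sketched as a list of steps applied to a file in turn. This is a minimal illustration for an already-downloaded incoming file, assuming an archive-then-decompress order and GZIP compression; the function names are hypothetical, and the decrypt step is left out because Python's standard library has no AES or Blowfish implementation.

```python
import gzip
import shutil
from pathlib import Path


def archive(path: Path, archive_dir: Path) -> Path:
    """Keep an untouched copy of the file, then pass the original on."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy2(path, archive_dir / path.name)
    return path


def decompress(path: Path) -> Path:
    """Gunzip 'name.gz' to 'name' next to it and return the new path."""
    target = path.with_suffix("")  # strips the trailing '.gz'
    with gzip.open(path, "rb") as src, open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return target


def process_incoming(path: Path, archive_dir: Path) -> Path:
    """Run the post-download tasks in order on one incoming file."""
    steps = [lambda p: archive(p, archive_dir), decompress]
    for step in steps:
        path = step(path)
    return path
```

An outgoing sequence would be the mirror image: compress, archive, then upload.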
Processing Control
DSP Transfer allows process monitoring and flexible system date card flipping:
• Set up a “Runtime Limit” for processing
• Check whether all files are processed after the runtime limit
• Update the system date to the next processing date
• Set processing to SUCCESS
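One possible reading of the processing control steps is sketched below. The slides do not spell out the rules, so this rests on assumptions: the next processing date is taken to be the next calendar day, and the `RUNNING` and `INCOMPLETE` status names are hypothetical (only `SUCCESS` appears in the original).

```python
from datetime import date, datetime, timedelta
from typing import Dict, Tuple


def close_processing_day(processing_date: date,
                         file_status: Dict[str, bool],
                         now: datetime,
                         runtime_limit: datetime) -> Tuple[str, date]:
    """Return (status, system date) after checking the day's files.

    file_status maps each expected file name to True once processed.
    """
    if now < runtime_limit:
        # The runtime limit has not been reached yet; nothing to decide.
        return "RUNNING", processing_date
    if all(file_status.values()):
        # Every expected file is done: flip the date card.
        return "SUCCESS", processing_date + timedelta(days=1)
    # Hypothetical status for files still missing after the limit.
    return "INCOMPLETE", processing_date
```

A real calendar would also need to skip weekends and holidays, which this sketch ignores.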
Other Transfer Functionality
Ability to set up each file transfer separately or in bulk using wildcards
Support for immediate local file archiving with its own compression option
Support for different compression methods such as GZIP and ZIP
Support for different data encryption schemes such as AES and Blowfish
Support for common transfer protocols such as FTP, SFTP, HTTP, HTTPS
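Bulk setup with wildcards can be pictured as a small rule table matched against file names. The sketch below is an assumption about how such matching might look, not DSP's actual configuration format; the `RULES` entries, their keys and the first-match-wins policy are all hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Dict, List, Optional

# Hypothetical transfer rules; the first matching pattern wins.
RULES: List[Dict[str, str]] = [
    {"pattern": "prices_*.csv.gz", "protocol": "SFTP", "compression": "GZIP"},
    {"pattern": "*.zip", "protocol": "HTTPS", "compression": "ZIP"},
]


def rule_for(filename: str) -> Optional[Dict[str, str]]:
    """Return the first rule whose wildcard pattern matches the file name."""
    for rule in RULES:
        if fnmatchcase(filename, rule["pattern"]):
            return rule
    return None  # no rule configured for this file
```

One wildcard rule then covers every daily `prices_*.csv.gz` file without a separate entry per file.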
DSP File Parser
DSP Parser matches file data to specific database tables and creates table data files ready for loading
[Diagram: Data File → Parser → Database Table Files]
DSP Parser Functionality
• Field Mapping: Fields in the file can be automatically detected and mapped to tables
• Data Validation: Data can be validated and formatted at the source file level or outfile field level
• Internal Data Generation: Internal data can be defined and included in the output
• Header & Trailer Validation: Header & trailer can be validated based on custom rules
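The parser features above can be illustrated with a small sketch. The file layout here is invented for the example: a pipe-delimited file with an `H|<date>` header, `D|...` detail records and a `T|<record count>` trailer. The function name and the `load_date` field are likewise hypothetical.

```python
from typing import Dict, List


def parse_file(lines: List[str], fields: List[str]) -> List[Dict[str, str]]:
    """Validate header and trailer, then map detail records to field names."""
    if len(lines) < 2 or not lines[0].startswith("H|"):
        raise ValueError("missing header record")
    if not lines[-1].startswith("T|"):
        raise ValueError("missing trailer record")
    load_date = lines[0].split("|")[1]

    rows = []
    for line in lines[1:-1]:
        values = line.split("|")[1:]  # drop the 'D' record type
        if len(values) != len(fields):
            raise ValueError(f"bad field count: {line!r}")
        row = dict(zip(fields, values))  # field mapping
        row["load_date"] = load_date     # internal data added to the output
        rows.append(row)

    # Trailer rule: the stated record count must match what was parsed.
    expected = int(lines[-1].split("|")[1])
    if expected != len(rows):
        raise ValueError(f"trailer says {expected} records, found {len(rows)}")
    return rows
```

For example, `parse_file(["H|20240131", "D|IBM|185.20", "T|1"], ["symbol", "price"])` yields one row carrying `symbol`, `price` and the header's date as `load_date`.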
DSP File Loader
DSP Loader loads table files into the designated database tables
[Diagram: Table Files → Loader → Enterprise Databases]
DSP Loader Functionality
• True Parallel Processing: True parallel processing is available, i.e. each table update is performed by a separate process, not just a thread running within the same process
• Smart Parallel Processing: The main process keeps track of the tables being updated and prevents other processes from updating them at the same time, which would cause deadlocks
• Different Types of Updates: Three update methods are available to best match the DB system, including smart updates in which only the changed records are updated
• Primary Key Detection: The primary key can be automatically detected and used to update, insert and delete records
• Ad-hoc SQL Processing: Custom SQL statements can be executed before and after a table update
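The idea behind a smart update, where only changed records are touched, can be sketched as a comparison keyed on the primary key. This is a generic illustration of the technique rather than DSP's algorithm; `plan_smart_update` and its in-memory row format are assumptions made for the example.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]


def plan_smart_update(current: List[Row], incoming: List[Row], key: str
                      ) -> Tuple[List[Row], List[Row], List[object]]:
    """Compare table rows with incoming rows by primary key.

    Returns (rows to insert, rows to update, keys to delete).
    Rows that are identical in both sets are left alone.
    """
    old = {row[key]: row for row in current}
    new = {row[key]: row for row in incoming}
    inserts = [row for k, row in new.items() if k not in old]
    updates = [row for k, row in new.items() if k in old and old[k] != row]
    deletes = [k for k in old if k not in new]
    return inserts, updates, deletes
```

On a large table where few records change per load, the resulting plan is much smaller than a full reload.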
DSP File Generator
DSP Generator extracts data from a database or a file and generates a new data file
[Diagram: Input File or Database → File Generator → Output Files]
Generator Functionality
• Database or data file as source
• Conditional record selection
• Data validation & formatting at the file & field level
• Internal data definition and generation
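Conditional record selection and field-level formatting from a database source can be sketched as follows. SQLite is used only as a stand-in because it ships with Python; it is not one of the databases listed for DSP. The `trades` table, its columns and the pipe-delimited output format are hypothetical.

```python
import csv
import io
import sqlite3


def generate_file(conn: sqlite3.Connection, min_qty: int) -> str:
    """Select matching records and write them as pipe-delimited text."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="|", lineterminator="\n")
    cursor = conn.execute(
        # Conditional record selection via a parameterized WHERE clause.
        "SELECT symbol, qty, price FROM trades WHERE qty >= ? ORDER BY symbol",
        (min_qty,),
    )
    for symbol, qty, price in cursor:
        # Field-level formatting: prices always carry two decimals.
        writer.writerow([symbol, qty, f"{price:.2f}"])
    return out.getvalue()
```

Writing to a string keeps the sketch self-contained; a real generator would stream to a file for the transfer step to pick up.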
DSP Platform & DB Support
DSP supports most major platforms and database systems
• Platforms: Linux, Windows, Solaris, AIX
• Databases: MSSQL, Oracle, Sybase, MySQL, PostgreSQL*, DB2*
* Planned in future releases
Flexible Process Setup
DSP offers flexible processing setup through the use of 1) system database tables, which can be overridden by 2) config files, which in turn can be overridden on the 3) command line
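The override order can be sketched with a layered lookup in which the first layer that defines a setting wins. This is a minimal illustration, not DSP's configuration code; the setting names and default values are hypothetical.

```python
from collections import ChainMap
from typing import Any, Dict

# Hypothetical built-in defaults, used when no layer sets an option.
DEFAULTS: Dict[str, Any] = {"protocol": "SFTP", "compression": "GZIP", "parallel": 1}


def effective_settings(db: Dict[str, Any],
                       config_file: Dict[str, Any],
                       command_line: Dict[str, Any]) -> Dict[str, Any]:
    """Merge the layers; earlier arguments to ChainMap take precedence."""
    return dict(ChainMap(command_line, config_file, db, DEFAULTS))
```

For instance, with `parallel` set to 4 in the database and 8 on the command line, the effective value is 8, while any option no layer mentions falls back to its default.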
Planned Functionality
Future releases will add other significant functionality to create a comprehensive back-end enterprise system
• XML Support: Parsing, loading and generating of XML-type data files
• Messaging: A bridge for data movement between files, databases and messaging systems
• Archiving: Moving, compressing and encrypting files locally or remotely across the enterprise network