IUP’s Production Data Warehouse By Indiana University of Pennsylvania Daniel J. Kuta.

IUP’s Production Data Warehouse

By

Indiana University of Pennsylvania

Daniel J. Kuta

Agenda
- Introduction
- Overview of the hardware/software environment
- Overview of the data warehouse items that have been implemented
- Items that have worked well
- Items that have been a challenge

Agenda
- Future directions
- Useful references

Introduction
- Database administrator at IUP
- Graduate of IUP
- March 2004: 19 years at IUP
- Worked primarily with the Office of the Registrar, Graduate School, Undergraduate Admissions and some Financial Aid

About IUP
- Approx. 13,800 students; 1,800 employees
- Largest member, SSHE
- 3 campuses; 1 center; 1 academy
- More than 100 undergraduate programs, close to 50 master’s degree programs and 8 programs leading to a doctoral degree
- Clock-hour programs

Banner at IUP
- Implemented five baseline modules and three “Web For” products 1998-2000
- Banner 5.x (soon to be Banner 6)
- Oracle 9i, OAS (soon to be 9iAS)
- Sun Solaris

Post Implementation
- SSD, FAMIS, Workflow, TouchNet, Resource25, IDWorks, CSI
- Web For Admissions and Web For Alumni
- Quest Central For Oracle
- Dozens of custom-written programs and web applications
- Large data warehousing initiative

Banner IT Support at IUP
- Application Development Group
  - 1 Coordinator
  - 1 Senior Systems Analyst/DBA
  - 2 Senior DBAs
  - 7 Developers
- Miscellaneous Entities
  - User Services, Tech. Services, Acad. Support Reps., Power Users

Hardware/software environment
- Development environment
  - Dell PowerEdge 2500 server
  - 1.266 GHz CPU
  - 1 GB RAM
  - 72 GB disk storage
  - Windows 2000 Server operating system
  - Oracle 9.2.0.4 Enterprise Edition

Hardware/software environment
- Production environment
  - Sun Sparc Ultra-4 server
  - Four 296 MHz CPUs
  - 2 GB RAM
  - 208 GB total disk storage
  - Sun Solaris 5.6 operating system
  - Oracle 8.1.6.3 Enterprise Edition

Initial End User Base
- Staff within the office of the Vice Provost for Administration and Technology
- Staff within the University Planning and Analysis and Institutional Research area

Current projects and their impetus
- Replacement of an Institutional Research database.
- Migration of routines from MS Access queries to packaged PL/SQL procedures.
- Want the ability to “prove” or “justify” the data.

Current Meeting Structure
- Meet with representatives from the Associate Provost and the Planning and Analysis areas approximately every three weeks.
- Set agenda of topics to discuss.
- Email/phone call follow-ups as needed between meetings.

Overview of Data Warehouse items that have been implemented
- User starting point
  - Warehouse web site
  - Intro page lists the data sets or subject areas available with a brief description.
  - Last extract/freeze date recorded.
  - Next extract/freeze date identified.

Overview of Data Warehouse items that have been implemented
- Student Grades
  - Provides data that allows for the analysis of grades and program review.
- Course Master
  - Allows further analysis of the above, plus enrollment and credit hours generated by student level within a course.

Overview of Data Warehouse items that have been implemented
- Course Schedule
  - Provides data that allows for the monitoring of enrollment levels in courses at determined intervals.
  - Allows for attrition/migration analysis.
- Quarterly Financial Summarizations
  - A rollup of financial data within Fund, Organization, Program and Account Code.
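
To illustrate the kind of rollup described above, here is a minimal Python sketch using the standard-library sqlite3 module in place of the Oracle warehouse. The table `ledger_detail`, its column names, and the sample rows are all hypothetical; they are not Banner structures or IUP data.

```python
import sqlite3

# In-memory database with a hypothetical detail table (illustrative names only).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE ledger_detail (
        fiscal_year INTEGER, fiscal_quarter INTEGER,
        fund TEXT, orgn TEXT, prog TEXT, acct TEXT,
        amount REAL
    )
""")
conn.executemany(
    "INSERT INTO ledger_detail VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (2004, 1, "F100", "O200", "P10", "A6000", 150.0),
        (2004, 1, "F100", "O200", "P10", "A6000", 50.0),
        (2004, 1, "F100", "O200", "P10", "A7000", 25.0),
        (2004, 2, "F100", "O200", "P10", "A6000", 10.0),
    ],
)

def quarterly_rollup(conn):
    """Sum detail amounts per quarter within Fund, Organization, Program, Account."""
    return conn.execute("""
        SELECT fiscal_year, fiscal_quarter, fund, orgn, prog, acct,
               SUM(amount) AS total_amount
        FROM ledger_detail
        GROUP BY fiscal_year, fiscal_quarter, fund, orgn, prog, acct
        ORDER BY fiscal_year, fiscal_quarter, fund, orgn, prog, acct
    """).fetchall()

rollup = quarterly_rollup(conn)
```

Each row of `rollup` is one summarized Fund/Organization/Program/Account combination for a quarter; the detail rows stay available for anyone who needs to trace a total back to its sources.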

Overview of Data Warehouse items that have been implemented
- Payroll
  - Source is the bi-weekly payroll Focus feeds received from the SSHE payroll system.
  - Data regarding earnings, benefits, deductions, etc. is recorded.
  - Data is linked back to Banner position numbers, giving a tie back to the FOAPAL strings responsible for the expense.

Items that have worked well
- Extract all data needed to a staging database and build/rebuild from there.
  - All columns traced back to their source.
  - Once a table was “touched” with a column or columns, the entire row is extracted to the staging database. All columns are pulled.

Items that have worked well
- Extract all data needed to a staging database and build/rebuild from there.
  - Staging tables mimic the layout of their source tables. They include an additional column that identifies their “freeze id.”
  - All rows required from the source tables are extracted and tagged in the staging database with an indicator to tie them together: the “freeze id” column is populated.
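
The freeze id idea can be sketched in a few lines of Python with sqlite3. This is only an illustration under stated assumptions: two in-memory databases stand in for Banner production and staging, and the table `course_grade`, its columns, and the sample rows are hypothetical.

```python
import sqlite3

# Two in-memory databases stand in for production and staging.
source = sqlite3.connect(":memory:")
staging = sqlite3.connect(":memory:")

# Hypothetical source table; the name and columns are illustrative only.
source.execute(
    "CREATE TABLE course_grade (student_id TEXT, term TEXT, crn TEXT, grade TEXT)"
)
source.executemany(
    "INSERT INTO course_grade VALUES (?, ?, ?, ?)",
    [("S1", "200310", "10001", "A"),
     ("S2", "200310", "10001", "B"),
     ("S1", "200220", "20002", "C")],
)

# The staging table mimics the source layout and adds a freeze_id column.
staging.execute(
    "CREATE TABLE course_grade "
    "(student_id TEXT, term TEXT, crn TEXT, grade TEXT, freeze_id INTEGER)"
)

def extract_to_staging(freeze_id, term):
    """Copy whole rows for one term into staging, tagged with freeze_id."""
    rows = source.execute(
        "SELECT student_id, term, crn, grade FROM course_grade WHERE term = ?",
        (term,),
    ).fetchall()
    staging.executemany(
        "INSERT INTO course_grade VALUES (?, ?, ?, ?, ?)",
        [row + (freeze_id,) for row in rows],
    )
    staging.commit()
    return len(rows)

pulled = extract_to_staging(freeze_id=1, term="200310")

# A later change in the source does not disturb the frozen rows.
source.execute("UPDATE course_grade SET grade = 'A' WHERE student_id = 'S2'")
frozen = staging.execute(
    "SELECT grade FROM course_grade WHERE freeze_id = 1 AND student_id = 'S2'"
).fetchone()[0]
```

Because every staged row carries the same freeze id, any later build can select exactly the rows that belonged to that extract, regardless of what has changed in the source since.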

Items that have worked well
- Extract all data needed to a staging database and build/rebuild from there.
  - The builds of the data sets are now based on the staged data.
  - Any subsequent rebuilds all run from the same staged data.

Items that have worked well
- Extract all data needed to a staging database and build/rebuild from there.
  - Benefit: Consistent builds/rebuilds. Not hitting a moving target with data from the Banner production database.
  - Benefit: We’re able to “prove” and “justify” the builds of the data sets.

Items that have worked well
- Model-based construction for the extracts from Banner production to staging.
  - Parameter/profile tables were created to identify the source tables required to build the data sets for a subject area.
  - The tables also identified any special SQL “FROM” or “WHERE” clause logic that was needed to extract the data.

Items that have worked well
- Model-based construction for the extracts from Banner production to staging.
  - If a table was not listed as requiring any special SQL extract logic, the entire table was pulled.
  - This was used to pull copies of required Banner validation tables, usually needed for some transformations or code descriptions.
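
One way to picture the parameter/profile approach is the Python sketch below. It is not the actual IUP implementation: the `ExtractProfile` fields, the `stg_` prefix for staging tables, and the table names `course_grade`, `freeze_term` and `grade_code` are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ExtractProfile:
    """One row of a hypothetical parameter/profile table."""
    subject_area: str
    source_table: str
    extra_from: Optional[str] = None    # additional tables for the FROM clause
    where_clause: Optional[str] = None  # special filter logic, if any

def build_extract_sql(profile: ExtractProfile, freeze_id: int) -> str:
    """Return an INSERT ... SELECT that copies whole rows into staging."""
    src = profile.source_table
    from_part = src if not profile.extra_from else f"{src}, {profile.extra_from}"
    sql = (f"INSERT INTO stg_{src} "
           f"SELECT {src}.*, {int(freeze_id)} FROM {from_part}")
    if profile.where_clause:
        sql += f" WHERE {profile.where_clause}"
    return sql

profiles = [
    # Special FROM/WHERE logic: only rows for terms listed for the freeze.
    ExtractProfile("grades", "course_grade",
                   extra_from="freeze_term",
                   where_clause="course_grade.term = freeze_term.term"),
    # No special logic listed, so the entire validation table is pulled.
    ExtractProfile("grades", "grade_code"),
]

statements = [build_extract_sql(p, freeze_id=7) for p in profiles]
```

The statements are assembled by string formatting, which is reasonable here only because the profile rows are maintained by the warehouse team rather than supplied by end users.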

Items that have worked well
- Model-based construction for the extracts from Banner production to staging.
  - Parameter/profile tables assisted with...
    - The generation of the scripts that created the tables in the staging database.
    - The generation of the SQL extract scripts to pull data from the Banner production database to the staging database.

Items that have worked well
- Model-based construction for the extracts from Banner production to staging.
  - Parameter/profile tables assisted with...
    - Scripts to provide “record counts” of data pulled to staging.
    - Scripts to delete data from test runs of the extract scripts in the staging database.

Items that have worked well
- Initially started with Java programs generating the scripts...
  - CREATE TABLE scripts for the staging database
  - The extract scripts
  - The record count scripts
  - The delete scripts
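
The talk describes Java programs for this step; the sketch below uses Python purely for illustration and covers the CREATE TABLE, record count and delete scripts. The function names, the `stg_` prefix, and the column metadata are hypothetical, and sqlite3 is used only to show that the generated text actually runs.

```python
import sqlite3

def create_table_script(table, columns):
    """CREATE TABLE for a staging copy: source columns plus freeze_id."""
    col_lines = [f"    {name} {col_type}" for name, col_type in columns]
    col_lines.append("    freeze_id INTEGER")
    return f"CREATE TABLE stg_{table} (\n" + ",\n".join(col_lines) + "\n);"

def record_count_script(table, freeze_id):
    """Count the rows staged under one freeze id."""
    return f"SELECT COUNT(*) FROM stg_{table} WHERE freeze_id = {int(freeze_id)};"

def delete_script(table, freeze_id):
    """Remove the rows of one (test) run from staging."""
    return f"DELETE FROM stg_{table} WHERE freeze_id = {int(freeze_id)};"

# Hypothetical column metadata; a real generator would read the source catalog.
columns = [("term_code", "TEXT"), ("term_desc", "TEXT")]

db = sqlite3.connect(":memory:")
db.executescript(create_table_script("term_codes", columns))
db.execute("INSERT INTO stg_term_codes VALUES ('200310', 'Fall 2003', 99)")
count_before = db.execute(record_count_script("term_codes", 99)).fetchone()[0]
db.executescript(delete_script("term_codes", 99))
count_after = db.execute(record_count_script("term_codes", 99)).fetchone()[0]
```

Generating all the scripts from the same metadata keeps the staging layout, the counts and the cleanup in step with each other when a source table is added to a subject area.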

Items that have worked well
- Initially started with Java programs generating the scripts...
  - Running the extract scripts in this manner worked well for high volume, low frequency extracts (3 per year).
  - It was a manageable process.
  - However...

Items that have worked well
- PL/SQL packaged procedures were created to dynamically create and execute the SQL extract scripts.
  - Need dictated by low volume, high frequency, off-hours extracts.
  - Additional tables were created to record run-time parameters and the job’s results.

Items that have worked well
- PL/SQL packaged procedures were created to dynamically create and execute the SQL extract scripts.
  - Procedures run “unattended”, logging their results.
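
The production procedures are PL/SQL; as a rough analogue, the Python sketch below builds an extract statement at run time, executes it, and records the run-time parameters and outcome in a log table. The `job_log` layout, the job names, and the `term_codes` table are assumptions for illustration only.

```python
import json
import sqlite3
from datetime import datetime, timezone

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE term_codes (term_code TEXT, term_desc TEXT);
    INSERT INTO term_codes VALUES ('200310', 'Fall 2003');
    INSERT INTO term_codes VALUES ('200320', 'Spring 2004');
    CREATE TABLE stg_term_codes (term_code TEXT, term_desc TEXT, freeze_id INTEGER);
    CREATE TABLE job_log (
        job_name TEXT, parameters TEXT, started_at TEXT, ended_at TEXT,
        status TEXT, rows_extracted INTEGER, message TEXT
    );
""")

def run_extract_job(job_name, source_table, freeze_id):
    """Build and run one extract statement, then log what happened."""
    params = {"source_table": source_table, "freeze_id": freeze_id}
    started = datetime.now(timezone.utc).isoformat()
    sql = (f"INSERT INTO stg_{source_table} "
           f"SELECT {source_table}.*, ? FROM {source_table}")
    try:
        with db:  # commits on success, rolls back on error
            db.execute(sql, (freeze_id,))
        rows = db.execute(
            f"SELECT COUNT(*) FROM stg_{source_table} WHERE freeze_id = ?",
            (freeze_id,),
        ).fetchone()[0]
        status, message = "OK", None
    except sqlite3.Error as exc:
        # Nobody is watching an off-hours run, so the error goes to the log.
        rows, status, message = 0, "FAILED", str(exc)
    ended = datetime.now(timezone.utc).isoformat()
    with db:
        db.execute(
            "INSERT INTO job_log VALUES (?, ?, ?, ?, ?, ?, ?)",
            (job_name, json.dumps(params), started, ended, status, rows, message),
        )
    return status

ok_status = run_extract_job("nightly_terms", "term_codes", freeze_id=12)
bad_status = run_extract_job("nightly_missing", "no_such_table", freeze_id=12)
```

A failed job leaves a row in `job_log` with its error message instead of stopping silently, which is what makes unattended runs reviewable the next morning.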

Items that have worked well
- Builds of the data sets are done by PL/SQL packaged procedures.
  - Call to execute a “build procedure” with passed parameters that identify the “freeze data” to use.
  - Vast majority of the transformations and description lookups coded as PL/SQL functions.
    - Benefit: reusability
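
A build procedure of this shape might look like the following Python sketch, with small functions standing in for the PL/SQL lookup and transformation functions. All table names, the `ds_student_grades` data set, and the grade scale in `quality_points` are hypothetical.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE stg_course_grade
        (student_id TEXT, level_code TEXT, grade TEXT, freeze_id INTEGER);
    CREATE TABLE stg_level_code
        (level_code TEXT, level_desc TEXT, freeze_id INTEGER);
    CREATE TABLE ds_student_grades
        (student_id TEXT, level_desc TEXT, grade TEXT,
         quality_points REAL, freeze_id INTEGER);
    INSERT INTO stg_level_code VALUES ('UG', 'Undergraduate', 1);
    INSERT INTO stg_level_code VALUES ('GR', 'Graduate', 1);
    INSERT INTO stg_course_grade VALUES ('S1', 'UG', 'A', 1);
    INSERT INTO stg_course_grade VALUES ('S2', 'GR', 'B', 1);
""")

def level_description(freeze_id, level_code):
    """Reusable description lookup against the frozen validation table."""
    row = db.execute(
        "SELECT level_desc FROM stg_level_code "
        "WHERE freeze_id = ? AND level_code = ?",
        (freeze_id, level_code),
    ).fetchone()
    return row[0] if row else None

def quality_points(grade):
    """Reusable transformation; this grade scale is an assumption."""
    return {"A": 4.0, "B": 3.0, "C": 2.0, "D": 1.0, "F": 0.0}.get(grade)

def build_student_grades(freeze_id):
    """Rebuild the data set for one freeze from staged rows only."""
    with db:
        db.execute("DELETE FROM ds_student_grades WHERE freeze_id = ?", (freeze_id,))
        staged = db.execute(
            "SELECT student_id, level_code, grade FROM stg_course_grade "
            "WHERE freeze_id = ? ORDER BY student_id",
            (freeze_id,),
        ).fetchall()
        db.executemany(
            "INSERT INTO ds_student_grades VALUES (?, ?, ?, ?, ?)",
            [(sid, level_description(freeze_id, lvl), grade,
              quality_points(grade), freeze_id)
             for sid, lvl, grade in staged],
        )
    return len(staged)

first = build_student_grades(1)
second = build_student_grades(1)  # a rebuild from the same freeze gives the same rows
built = db.execute("SELECT * FROM ds_student_grades ORDER BY student_id").fetchall()
```

Because the build reads only rows tagged with the given freeze id and clears its own earlier output first, running it twice produces the same data set.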

Items that have worked well
- The completed data sets are built in the staging database.
  - This allows for an analysis of the builds by validation procedures.

Items that have worked well
- After validation of the new data sets in the staging database, the new data sets are then copied into the data warehouse.
  - Separate procedures are used to perform the migration of the data from staging to the data warehouse.
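
The validate-then-migrate sequence can be sketched as two separate functions. This is a simplified Python illustration, not the actual procedures: the two validation rules, the `ds_student_grades` layout, and the sample rows are assumptions.

```python
import sqlite3

staging = sqlite3.connect(":memory:")
warehouse = sqlite3.connect(":memory:")
ddl = ("CREATE TABLE ds_student_grades "
       "(student_id TEXT, level_desc TEXT, grade TEXT, freeze_id INTEGER)")
staging.execute(ddl)
warehouse.execute(ddl)
staging.executemany(
    "INSERT INTO ds_student_grades VALUES (?, ?, ?, ?)",
    [("S1", "Undergraduate", "A", 1),
     ("S2", "Graduate", "B", 1),
     ("S3", None, "C", 2)],  # freeze 2 contains a failed description lookup
)
staging.commit()

def validate(freeze_id):
    """Return a list of problems; an empty list means the build passed."""
    problems = []
    total, missing = staging.execute(
        "SELECT COUNT(*), SUM(level_desc IS NULL) "
        "FROM ds_student_grades WHERE freeze_id = ?",
        (freeze_id,),
    ).fetchone()
    if total == 0:
        problems.append("data set is empty")
    elif missing:
        problems.append(f"{missing} row(s) missing a level description")
    return problems

def migrate(freeze_id):
    """Copy a validated data set from staging into the warehouse."""
    problems = validate(freeze_id)
    if problems:
        raise ValueError("; ".join(problems))
    rows = staging.execute(
        "SELECT * FROM ds_student_grades WHERE freeze_id = ?", (freeze_id,)
    ).fetchall()
    with warehouse:  # replace any earlier copy of this freeze in one transaction
        warehouse.execute(
            "DELETE FROM ds_student_grades WHERE freeze_id = ?", (freeze_id,)
        )
        warehouse.executemany(
            "INSERT INTO ds_student_grades VALUES (?, ?, ?, ?)", rows
        )
    return len(rows)

migrated = migrate(1)
```

Keeping migration separate from the build means a data set that fails validation never reaches the tables that users query.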

Items that have worked well
- Once the updated data sets have been migrated into the data warehouse…
  - The data warehouse web site is updated to reflect the status of the data sets available.
  - Keeping the web site updated and current is necessary to gain user buy-in to use it.
  - Otherwise, expect phone calls asking for the status of...

Items that have been a challenge
- User dictated design: replacement of the existing IR database.
  - Too many databases and queries were already written and dependent on the existing structure.

Items that have been a challenge
- Discovery of all existing transformations
  - Transformations hidden in a vast array of MS Access queries.
  - Special “fix” routines coded in SQL scripts run through SQL*Plus.

Items that have been a challenge
- Missing data
  - Analysis of the builds sheds light on data missing from the Banner production database.
  - Resolution: Identify critical data. Verify it is available prior to performing the extracts.
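
A pre-extract check along these lines could be as simple as a set of counting queries. The Python sketch below is an assumption-laden illustration: the `registration` table and the single check shown are hypothetical examples of “critical data”, not IUP’s actual rules.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE registration (student_id TEXT, term TEXT, crn TEXT, grade TEXT);
    INSERT INTO registration VALUES ('S1', '200310', '10001', 'A');
    INSERT INTO registration VALUES ('S2', '200310', '10001', NULL);
    INSERT INTO registration VALUES ('S3', '200320', '20002', 'B');
""")

# Each check counts rows that would leave the extract incomplete.
CRITICAL_CHECKS = {
    "registrations without a final grade":
        "SELECT COUNT(*) FROM registration WHERE term = ? AND grade IS NULL",
}

def missing_critical_data(term):
    """Return {check name: offending row count} for the checks that fail."""
    failures = {}
    for name, sql in CRITICAL_CHECKS.items():
        count = db.execute(sql, (term,)).fetchone()[0]
        if count:
            failures[name] = count
    return failures

def ready_to_extract(term):
    """True only when every critical-data check finds nothing missing."""
    return not missing_critical_data(term)
```

Running such checks before the freeze gives the responsible office a chance to fill in the data, rather than discovering the gap during analysis of the build.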

Items that have been a challenge
- User availability
  - Subject matter experts must be available to provide needed information and feedback in a timely manner.

Items that have been a challenge
- Parallel builds of the data
  - Difficulty in coordinating parallel builds of the data sets within both systems in order to perform validation of the new procedures.
- User testing
  - Parallel builds performed. Yeah!
  - User participation in the validation of the builds was lacking.

Future directions/plans
- Complete the deactivation of the old IR database.
  - SSHE-related semester freezes.
- Add additional functionality to the “job execution” environment.
  - Currently logs start time, end time and duration of the entire job.

Future directions/plans
- Add additional functionality to the “job execution” environment.
  - Will have it log each “job step” or extract it is performing.
  - Record the start time, end time and duration of the step.
  - Metadata on the target table: initial storage requirements, its needs after the extract and the change in those requirements.

Future directions/plans
- Add additional functionality to the “job execution” environment.
  - Keep the “build” and “migration” procedures, but add procedure calls to perform the logging of the job’s metadata.
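
Per-step logging that wraps existing procedures without rewriting them could be sketched as a Python context manager. Everything here is hypothetical: the `job_step_log` layout, the step names, and in particular the use of row counts as a stand-in for the storage figures mentioned above, which in Oracle would come from the database’s own storage metadata.

```python
import sqlite3
import time
from contextlib import contextmanager

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE stg_term_codes (term_code TEXT, freeze_id INTEGER);
    CREATE TABLE job_step_log (
        job_id INTEGER, step_name TEXT, target_table TEXT, status TEXT,
        started_at REAL, ended_at REAL, duration_sec REAL,
        rows_before INTEGER, rows_after INTEGER, rows_added INTEGER
    );
""")

def row_count(table):
    return db.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]

@contextmanager
def job_step(job_id, step_name, target_table):
    """Log timing and target-table growth for one step of a job."""
    rows_before = row_count(target_table)  # row count stands in for storage size
    started_at = time.time()
    timer_start = time.perf_counter()
    status = "OK"
    try:
        yield
    except Exception:
        db.rollback()  # discard the step's partial work before logging
        status = "FAILED"
        raise
    finally:
        duration = time.perf_counter() - timer_start
        rows_after = row_count(target_table)
        db.execute(
            "INSERT INTO job_step_log VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
            (job_id, step_name, target_table, status,
             started_at, started_at + duration, duration,
             rows_before, rows_after, rows_after - rows_before),
        )
        db.commit()

# An existing extract or build call only needs to be wrapped.
with job_step(job_id=1, step_name="extract term codes",
              target_table="stg_term_codes"):
    db.executemany("INSERT INTO stg_term_codes VALUES (?, ?)",
                   [("200310", 5), ("200320", 5)])

logged = db.execute(
    "SELECT step_name, status, rows_before, rows_after, rows_added "
    "FROM job_step_log"
).fetchall()
```

The wrapped code is unchanged, which matches the stated goal of keeping the build and migration procedures and only adding logging calls around them.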

Future directions/plans
- Existing project in the queue for financial reporting.
  - Desire is to have flexible, responsive, rollup reporting.
  - Detail data must be available for drilldown.
  - Look to model budgets, commitments, payments, revenue, etc.

Future directions/plans
- Existing project in the queue for financial reporting.
  - Challenges:
    - No intimate knowledge of Banner Finance.
    - First truly dimensional model.
    - Some ragged hierarchies.
    - Implementation of change data capture procedures.

Future directions/plans
- Change the focus of the data warehousing projects.
  - Currently, too heavy on mandated state reporting.
  - Its focus is on reporting the past, or “what has happened.”

Future directions/plans
- Change the focus of the data warehousing projects.
  - Need to direct attention to the detection of trends and our reaction to them.
  - And yes, you do need historical data to do that.
  - But it must be in the proper format to easily answer the questions that are asked.

Future directions/plans
- As a simple example, running a University (or any business) is a lot like driving a car...
  - Can you successfully get to where you want to be by constantly looking in the rear view mirror?
  - You must look out the front windshield and focus on what you see.
  - Like it or not, there’s stuff coming at you!

Future directions/plans
- As a simple example, running a University (or any business) is a lot like driving a car...
  - You must navigate around any obstacles you encounter.
  - But this is only short-term success, a nice leisurely drive.
  - You need direction, a destination, and a “road map” to get there.

Future directions/plans
- As a simple example, running a University (or any business) is a lot like driving a car...
  - The strategic plan of the university defines its goals: its “destination.”
  - If so, what does our plan or “road map” look like for reaching that destination?
  - Have we aligned our data warehouse initiatives with that plan?

Future directions/plans
- As a simple example, running a University (or any business) is a lot like driving a car...
  - Are we collecting and analyzing the data needed to measure our progress at reaching that destination?
  - What triggers a change, a “detour” or “alternate route” in the journey?

Conclusion
- Satisfied with the environment setup to perform the extracts, builds and migrations of the data sets.
- Users are satisfied with what they are receiving.

Conclusion
- Yes, I feel a level of frustration that the initiatives have focused on mandated reporting: the “What happened?” reporting.
- Need to implement structures to capture and provide more metadata on the data sets and the procedures and functions that build them.

Useful references
- Books
  - Building the Data Warehouse. W. H. Inmon. © 1996, John Wiley & Sons
  - The Data Warehouse Toolkit: Practical Techniques for Building Dimensional Data Warehouses. Ralph Kimball. © 1996, John Wiley & Sons

Useful references
- Books
  - The Data Warehouse Toolkit, Second Edition: The Complete Guide to Dimensional Modeling. Ralph Kimball, Margy Ross. © 2002, Wiley Computer Publishing
  - Data Warehouse Design Solutions. Christopher Adamson, Michael Venerable. © 1998, Wiley Computer Publishing

Useful references
- Books
  - Designing a Data Warehouse: Supporting Customer Relationship Management. Chris Todman. Hewlett Packard Professional Books. © 2001, Prentice Hall Publishing

Useful references
- Books
  - Mastering Data Warehouse Design: Relational and Dimensional Techniques. Claudia Imhoff, Nicholas Galemmo, Jonathan G. Geiger. © 2003, Wiley Publishing

Useful references
- The Data Warehousing Institute: www.dw-institute.com
- Intelligent Enterprise: www.intelligententerprise.com
- DM Review: www.dmreview.com

Useful references
- Bill Inmon’s web sites: www.inmoncif.com, www.inmongif.com
- Ralph Kimball’s web site: www.ralphkimball.com
- Oracle 9.2 documentation set

Questions? Comments?

Dan Kuta

[email protected]

(724) 357-2887