Transforming the Student Administration RDS into a Historical Data Archive
Session #10591
March 9, 2004 11:50-12:50
HEUG 2004 Conference - Atlanta
Session #10591-Ray Helm 2
Ray Helm
Research Analyst, Office of Institutional Research and Planning
University of Kansas - Lawrence
Ryan Cherland
Director, University Management Information,
Associate Director, Office of Institutional Research and Planning
University of Kansas - Lawrence
Ray Helm is a Research Analyst with the Office of Institutional Research and Planning. He joined the University of Kansas as a Programmer Analyst with Decision Support Services in 2000. He has been involved in reporting from PeopleSoft systems for over 3 years and was the original RDS Administrator at KU.
Ryan Cherland is Director of University Management Information and Associate Director of the Office of Institutional Research and Planning. He has over 18 years of experience in institutional research, with the last 11 years at the University of Kansas-Lawrence. Ryan created DEMIS and has been involved with extracting and reporting on data from PeopleSoft systems since 1996. Ryan has a Ph.D. in Higher Education Administration with a minor emphasis in Educational Psychology and Research.
Synopsis
This presentation reviews the methodology used to create a historical archive of student administration data by customizing PeopleSoft’s RDS service.
• Reviewing Database Needs
• Setting Census Parameters
• Modifying Decision Stream Fact Builds
• Modifying Decision Stream Dimensions
• Special Jobstreams
• Reporting Examples
About the University of Kansas
• History - The University of Kansas opened its doors in 1866.
• Academics - The university offers more than 100 undergraduate and graduate majors and programs including allied health, architecture, business, education, engineering, fine arts, journalism, law, liberal arts and sciences, nursing, pharmacy, and social welfare.
• Lawrence Campus Enrollment – 20,692 undergraduates and 6,122 graduate students from every state in the nation and more than 100 countries around the world.
• FTE students per FTE faculty ratio: 14.7
PeopleSoft Student Admin/RDS at KU
• PS Version 8 implemented for Fall, 2003 enrollment, Financial Aid module implementation pending
• RDS installed October, 2002
• Census RDS created August, 2003
• All databases are Oracle 8.1.7 running on Unix servers
• Cognos Decision Stream runs on Windows 2000 server
Why Build a Census RDS?
• Daily RDS provides current snapshot, not designed for historical reporting
• Needed capacity to report across semesters to track historical trends and fulfill institutional research needs.
• Running queries against transactional “live” data was not desired.
• Census extract begun under legacy student records system needed to be continued
Census RDS Resource Needs
• Requires no additional software purchase
• Only additional resources needed are:
  – Storage space
  – Staff time to create, execute, and update
  – DBA assistance in setup and occasional support
KU’s RDS Catalogs and Databases
• Decision Stream Catalog Databases
  – Production Catalog for Daily RDS
  – Development Catalog for RDS Admins
  – Census Catalog
  – PS Delivered Catalog (Static)
• RDS Databases
  – Daily Production data
  – Development Test data
  – Production Census data
Getting Started: Create RDS Census Databases
1. Replicate Daily RDS Catalog database as Census Catalog database. If Daily RDS Catalog cannot be copied, restore using Decision Stream Catalog Backup/Restore process.
2. Create output database. Size should be roughly equal to Daily RDS database for starters. If possible make a database copy of Daily RDS to migrate users and roles.
Create CENS_PARAMS Table
• Same structure as ODS_PARAMS: columns VARIABLENAME and RESULT
• Supplements ODS_PARAMS as source of Decision Stream variable values
• Stores variables specific to census process
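A minimal sketch of such a parameter table, assuming the two-column name/value layout described above. Python's built-in sqlite3 module stands in for the Oracle output database, the column sizes are guesses, and the helper `get_param` is hypothetical. The sample values are the ones shown in this presentation's examples.

```python
import sqlite3

# In-memory SQLite stands in for the Oracle database (assumption).
conn = sqlite3.connect(":memory:")

# Two-column name/value layout mirroring ODS_PARAMS; column sizes are guesses.
conn.execute(
    "CREATE TABLE CENS_PARAMS ("
    " VARIABLENAME VARCHAR(30) PRIMARY KEY,"
    " RESULT VARCHAR(255))"
)

# Seed the values for one census point.
conn.executemany(
    "INSERT INTO CENS_PARAMS (VARIABLENAME, RESULT) VALUES (?, ?)",
    [
        ("CENSUS_TERM", "4042"),
        ("CENSUS", "2004201"),
        ("DATE_CUTOFF", "01-23-2004 00:10"),
    ],
)

def get_param(name):
    """Return the RESULT value for one variable, or None if it is missing."""
    row = conn.execute(
        "SELECT RESULT FROM CENS_PARAMS WHERE VARIABLENAME = ?", (name,)
    ).fetchone()
    return row[0] if row else None

print(get_param("CENSUS_TERM"))
```

Keeping census settings in their own table leaves ODS_PARAMS untouched, so the daily builds and the census builds can resolve variables independently.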
Census Variables
• CENSUS_TERM: Term code as it is referenced in PS source data
• CENSUS: Unique identifier that will distinguish census point records within the Census RDS
• DATE_CUTOFF: Official date and time when the census data was collected
ACTIVE***SQL Variables
Variables containing SQL statements that identify active students, instructors, recruiters, and applicants
• ACTIVESTUDENTSQL
• ACTIVEINSTRUCTORSQL
• ACTIVERECRUITERSQL
• ACTIVEAPPLICANTSQL
CENS_PARAMS Examples
VARIABLENAME          RESULT
CENSUS_TERM           4042
CENSUS                2004201
ACTIVEINSTRUCTORSQL   (SELECT EMPLID FROM PS_CLASS_INSTR WHERE STRM='4042')
DATE_CUTOFF           01-23-2004 00:10
Decision Stream Fact Build
RDS Fact Build Modification
• Add Census variables to Fact Build Properties and modify existing CURDATE_SOURCE
• Modify DataStream SQL against source tables
• Add CENSUS variable to Transformation
• Add CENSUS variable to Fact Delivery, with index
• Change Delivery Method from Truncate to Append
Adding Variables to Fact Build Properties
Modify CURDATE_SOURCE
• Change from being the current date to the value of DATE_CUTOFF
• Variable Expression changed from:
LOOKUP('ODS_CURRENT', 'SELECT RESULT FROM ODS_PARAMS WHERE VARIABLENAME=''CURDATE_SOURCE''')
To:
Concat( 'TO_DATE(''', LOOKUP('ODS_CURRENT', 'SELECT RESULT FROM CENS_PARAMS WHERE VARIABLENAME=''DATE_CUTOFF'''), ''',''MM-DD-YYYY HH24:MI'')')
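To see what the new expression resolves to, the sketch below mimics the Concat and LOOKUP calls in Python. The `lookup` function is a hypothetical stand-in for Decision Stream's LOOKUP against CENS_PARAMS, and the DATE_CUTOFF value is the one from this presentation's examples.

```python
from datetime import datetime

# Stand-in for the CENS_PARAMS table (value taken from the deck's examples).
CENS_PARAMS = {"DATE_CUTOFF": "01-23-2004 00:10"}

def lookup(variable_name):
    """Hypothetical stand-in for LOOKUP(...) against CENS_PARAMS."""
    return CENS_PARAMS[variable_name]

def curdate_source():
    """Mimic Concat('TO_DATE(''', <lookup>, ''',''MM-DD-YYYY HH24:MI'')')."""
    return "TO_DATE('" + lookup("DATE_CUTOFF") + "','MM-DD-YYYY HH24:MI')"

resolved = curdate_source()
print(resolved)

# The Oracle mask MM-DD-YYYY HH24:MI corresponds to this strptime format.
cutoff = datetime.strptime(lookup("DATE_CUTOFF"), "%m-%d-%Y %H:%M")
print(cutoff.isoformat())
```

The doubled single quotes in the Decision Stream expression are escaped literals, so the variable ends up holding a complete TO_DATE call that Oracle evaluates when the extract SQL runs.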
Modifying SQL WHERE Clause
• Replace
STRM BETWEEN {$START_TERM} AND {$END_TERM}
with
AND STRM={$CENSUS_TERM}
• Leave {$CURDATE_SOURCE} unchanged; the value in DATE_CUTOFF replaces the system date value when the variable is resolved.
• Add AND EMPLID IN {$ACTIVE***SQL} as needed
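The substitution step can be pictured with a small Python sketch. The `resolve` helper and the sample SELECT are hypothetical; Decision Stream performs its own variable resolution, and how it quotes the term value is not shown on the slides, so the value is inserted here exactly as stored.

```python
import re

# Values as they might be read from CENS_PARAMS (taken from the deck's examples).
variables = {
    "CENSUS_TERM": "4042",
    "ACTIVEINSTRUCTORSQL": "(SELECT EMPLID FROM PS_CLASS_INSTR WHERE STRM='4042')",
}

def resolve(sql, values):
    """Replace each {$NAME} token with its value; unknown names raise KeyError."""
    return re.sub(r"\{\$(\w+)\}", lambda m: values[m.group(1)], sql)

# Illustrative DataStream SQL, not an actual RDS build.
template = (
    "SELECT EMPLID, STRM FROM PS_CLASS_INSTR "
    "WHERE STRM={$CENSUS_TERM} "
    "AND EMPLID IN {$ACTIVEINSTRUCTORSQL}"
)
print(resolve(template, variables))
```

Because the ACTIVE***SQL variables hold whole subqueries, changing the census term in one place changes every build that references them.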
Add CENSUS Derivation to Transformation
Add CENSUS to Fact Delivery
Change Delivery Method to Append
Result of Fact Build Modifications
• Extract against source data selects only records pertaining to current term with effective dates before specified cutoff date
• CENSUS column added to output table to identify census point for each record
• New census data appended to existing data to create historical record
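The combined effect of these changes can be sketched with SQLite standing in for Oracle. The table names, columns, sample rows, and the cutoff date used for the first census point are all hypothetical; only the pattern (filter by term and cutoff, stamp with CENSUS, append) follows the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical source and output tables; names and columns are illustrative.
    CREATE TABLE SRC_ENROLL (EMPLID TEXT, STRM TEXT, EFFDT TEXT);
    CREATE TABLE ENROLL_FACT (CENSUS TEXT, EMPLID TEXT, STRM TEXT);
    CREATE INDEX ENROLL_FACT_CENSUS ON ENROLL_FACT (CENSUS);
    INSERT INTO SRC_ENROLL VALUES
        ('1001', '4039', '2003-08-20'),
        ('1002', '4039', '2003-10-01'),
        ('1001', '4042', '2004-01-15'),
        ('1003', '4042', '2004-02-10');
""")

def run_census(census, term, date_cutoff):
    """Append (never truncate) the rows for one term up to the cutoff date."""
    conn.execute(
        "INSERT INTO ENROLL_FACT (CENSUS, EMPLID, STRM) "
        "SELECT ?, EMPLID, STRM FROM SRC_ENROLL "
        "WHERE STRM = ? AND EFFDT <= ?",
        (census, term, date_cutoff),
    )

# Two census points; rows dated after each cutoff are left out.
run_census("2003920", "4039", "2003-09-20")
run_census("2004201", "4042", "2004-01-23")

counts = dict(conn.execute(
    "SELECT CENSUS, COUNT(*) FROM ENROLL_FACT GROUP BY CENSUS"))
print(counts)
```

Each run adds one slice of rows keyed by CENSUS, which is why the column is indexed and why the delivery method has to be Append rather than Truncate.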
Modifications to Dimensions
• SHARED_LOOKUPS and XLATTABLE
  – No changes needed if CURDATE_SOURCE variable references DATE_CUTOFF value
  – Reviewing the referenced lookups as each build is modified is recommended
• Dimensions Referencing Fact Builds
  – Add “AND CENSUS={$CENSUS}” to WHERE clause in Lookup DataStream SQL
Creating New Jobstreams
• CONFIGURE_CENSUS_RUN Jobstream
  – Changes values in CENS_PARAMS table to current census point
  – Updates census point information table
• ROLLBACK Jobstream
  – Allows for removal of all data for a specific census point
  – Run only when errors or run failures occur
CONFIGURE_CENSUS_RUN Jobstream
EDIT_CENS_PARAMS SQL Node
Edit/Update Census variables
UPDATE CENS_PARAMS SET RESULT='4042' WHERE VARIABLENAME='CENSUS_TERM';
UPDATE CENS_PARAMS SET RESULT='2004201' WHERE VARIABLENAME='CENSUS';
UPDATE CENS_PARAMS SET RESULT='01-23-2004 00:10' WHERE VARIABLENAME='DATE_CUTOFF';
EDIT_CENS_PARAMS SQL Node
TEMP_PARAMS table
CREATE GLOBAL TEMPORARY TABLE TEMP_PARAMS (CTERM CHAR(4), STR1 CHAR(255), STR2 CHAR(20));
INSERT INTO TEMP_PARAMS
SELECT TRIM(TRAILING FROM RESULT), '', ''
FROM CENS_PARAMS
WHERE VARIABLENAME='CENSUS_TERM';
EDIT_CENS_PARAMS SQL Node
UPDATE TEMP_PARAMS
SET STR1='(SELECT EMPLID FROM PS_CLASS_INSTR WHERE STRM='''
WHERE CTERM IS NOT NULL;
UPDATE TEMP_PARAMS
SET STR1=TRIM(TRAILING FROM STR1)||CTERM||TRIM(TRAILING FROM STR2)
WHERE CTERM IS NOT NULL;
UPDATE CENS_PARAMS
SET RESULT=(SELECT STR1 FROM TEMP_PARAMS WHERE CTERM IS NOT NULL)
WHERE VARIABLENAME='ACTIVEINSTRUCTORSQL';
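The net effect of this node is to rebuild the stored subquery around the current term code. The sketch below does the same thing in Python with SQLite standing in for Oracle. It assumes the finished value should match the ACTIVEINSTRUCTORSQL example shown earlier, including the closing quote and parenthesis; the function name is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CENS_PARAMS (VARIABLENAME TEXT PRIMARY KEY, RESULT TEXT)"
)
conn.executemany(
    "INSERT INTO CENS_PARAMS VALUES (?, ?)",
    [("CENSUS_TERM", "4042"), ("ACTIVEINSTRUCTORSQL", "")],
)

def refresh_active_instructor_sql():
    """Rebuild ACTIVEINSTRUCTORSQL from the current CENSUS_TERM value."""
    (term,) = conn.execute(
        "SELECT TRIM(RESULT) FROM CENS_PARAMS WHERE VARIABLENAME = 'CENSUS_TERM'"
    ).fetchone()
    # Prefix + term + closing "')"; the closing piece is an assumption based
    # on the example value shown for ACTIVEINSTRUCTORSQL.
    subquery = "(SELECT EMPLID FROM PS_CLASS_INSTR WHERE STRM='" + term + "')"
    conn.execute(
        "UPDATE CENS_PARAMS SET RESULT = ? "
        "WHERE VARIABLENAME = 'ACTIVEINSTRUCTORSQL'",
        (subquery,),
    )
    return subquery

print(refresh_active_instructor_sql())
```

Deriving the subquery from CENSUS_TERM means only the three values in the first SQL step need to be edited by hand for each census run.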
ROLLBACK_CENSUS_POINT Jobstream
Rollback Procedure Variables
SQL1: BUILD ROLLBACK TABLE
CREATE TABLE ROLLBACK1 (ROWNUMB NUMBER(8), TABLE_NAME CHAR(255));
INSERT INTO ROLLBACK1
SELECT ROW_NUMBER() OVER (ORDER BY TABLE_NAME), TABLE_NAME
FROM ALL_TABLES
WHERE OWNER='RDSSA'
AND (SUBSTR(TABLE_NAME,1,3) IN ('ADM','CC_','FA_','REC','SF_','IR_','XSY')
OR TABLE_NAME='CENSUS_POINT_INFO_TBL');
CENSUS ROLLBACK PROCEDURE
while $COUNTER <= RB_MAX()
do
BEGIN
$CNTR:=$COUNTER;
$TABLE:=RB_GETTABLE($CNTR);
LOGMSG(CONCAT('ROLLING BACK CENSUS POINT ',$CENS,' FROM TABLE: ',$TABLE));
RB_DelRows($TABLE,$RB_CENSUS);
$COUNTER:=$COUNTER+1;
END
UDF: RB_MAX()
Returns Total Table Count
• Implementation: Internal Calculation
• Returns Integer value calculated as:
RETURN LOOKUP('ODS_CURRENT','SELECT MAX(ROWNUMB) FROM ROLLBACK1');
UDF: RB_GETTABLE($CNTR)
Returns name of table to roll back
• $CNTR passed to UDF
• Internal Calculation returning table name:
RETURN LOOKUP('ODS_CURRENT', CONCAT('SELECT TABLE_NAME FROM ROLLBACK1 WHERE ROWNUMB=',TOCHAR($CNTR)));
RB_DelRows($TABLE,$RB_CENSUS)
Deletes rows from table
• $TABLE (table name) and $RB_CENSUS (Census value) are passed to UDF
• Internal Implementation executes:
SQL('ODS_CURRENT',CONCAT('DELETE FROM ', $TABLE, ' WHERE CENSUS=''', $CENS,''''));
• Example:
RB_DELROWS('MYTABLE','2004201') executes:
DELETE FROM MYTABLE WHERE CENSUS='2004201'
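The rollback pieces (table list, loop, per-table delete) fit together roughly as in the Python sketch below. SQLite's catalog stands in for ALL_TABLES, and the two fact tables and both function names are hypothetical; the real jobstream uses the Decision Stream UDFs described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Two hypothetical census tables, each carrying a CENSUS column.
    CREATE TABLE REC_DEMO_FACT (CENSUS TEXT, EMPLID TEXT);
    CREATE TABLE ADM_DEMO_FACT (CENSUS TEXT, EMPLID TEXT);
    INSERT INTO REC_DEMO_FACT VALUES ('2003920', '1001'), ('2004201', '1001');
    INSERT INTO ADM_DEMO_FACT VALUES ('2003920', '2002'), ('2004201', '2003');
""")

PREFIXES = ("ADM", "REC")  # subset of the prefixes listed in SQL1

def rollback_tables():
    """Counterpart of ROLLBACK1: census tables chosen by name prefix."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows if name[:3] in PREFIXES]

def rollback_census_point(census):
    """Delete every row for one census point; returns rows deleted per table."""
    deleted = {}
    for table in rollback_tables():
        # Table names cannot be bound as parameters; they come from the
        # catalog query above, never from user input.
        cur = conn.execute(f'DELETE FROM "{table}" WHERE CENSUS = ?', (census,))
        deleted[table] = cur.rowcount
    conn.commit()
    return deleted

print(rollback_census_point("2004201"))
```

Driving the loop from the catalog rather than a hand-kept list means newly added census tables are picked up by the rollback as long as they follow the naming prefixes.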
Migrating changes in Daily RDS to Census RDS
• New data needs are being identified as analysts from across the University begin using the Daily RDS, and as development of the transactional system continues.
• Changes to Daily RDS are relatively simple, since tables are truncated daily.
• Changing Census RDS tables is more difficult:
  – Need to add new columns without losing existing data
  – How do you get data for added columns (or new tables) for past census points?
Adding a New Table to Census RDS
• Adding new fact build is relatively simple:
  – Export Daily RDS fact build and related dimensions using Decision Stream CATEXP
  – Import into Census RDS using DS CATIMP
  – Modify fact build using methods described previously
• Decide how to populate past census points:
  – Restore census point backups one at a time and execute build?
  – Use current data? Leave empty?
Adding a New Column to Existing Fact Build
• More complicated than adding a new build
• Need to modify existing table, cannot just create new
• Potential for losing data exists (so be careful!)
• Backing up both Catalog and Output databases before beginning is highly recommended
Adding a New Column to Existing Fact Build (Cont.)
• Restoring census point backups and running fact build
  – Requires effort from technical support staff
  – Ensures data accuracy
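A small sketch of the in-place approach, with SQLite standing in for Oracle. The table, the new RESIDENCY_CD column, and the restored values are all hypothetical, and the backfill is shown as a plain UPDATE rather than a rerun of the fact build.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical census fact table holding two census points.
    CREATE TABLE ENROLL_FACT (CENSUS TEXT, EMPLID TEXT);
    INSERT INTO ENROLL_FACT VALUES ('2003920', '1001'), ('2004201', '1001');
""")

# Add the column in place so existing census rows are kept.
conn.execute("ALTER TABLE ENROLL_FACT ADD COLUMN RESIDENCY_CD TEXT")

# Backfill one past census point from a restored backup
# (represented here by a dict keyed on EMPLID).
restored_2003920 = {"1001": "RES"}
conn.executemany(
    "UPDATE ENROLL_FACT SET RESIDENCY_CD = ? "
    "WHERE CENSUS = '2003920' AND EMPLID = ?",
    [(code, emplid) for emplid, code in restored_2003920.items()],
)

rows = conn.execute(
    "SELECT CENSUS, EMPLID, RESIDENCY_CD FROM ENROLL_FACT ORDER BY CENSUS"
).fetchall()
print(rows)
```

Census points that are not backfilled simply carry NULL in the new column, which makes the gap visible to anyone reporting across terms.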
Database Size Comparisons
DATABASE                       SIZE (MB)
Daily RDS                      13,500
Census RDS (4 census points)   17,300
Options for Reducing Census RDS Size
• Drop tables/fact builds for unused tables
• Add SQL procedure steps to drop Staging Tables
  – Staging tables account for 1,250 MB of storage (7.2%)
  – Alternative: leave Staging table delivery as Truncate
• Remove unused/unnecessary columns from tables
  – Can you live with just one DESCR?
• Drop unneeded indexes
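The quoted staging share can be checked from the sizes reported in this presentation. The per-census-point average is only a rough derived figure, since the census points need not be equal in size.

```python
# Sizes in MB as reported in the presentation.
daily_mb = 13_500
census_mb = 17_300
staging_mb = 1_250
census_points = 4

staging_share = staging_mb / census_mb
print(f"Staging share of Census RDS: {staging_share:.1%}")
print(f"Census RDS relative to Daily RDS: {census_mb / daily_mb:.2f}x")
print(f"Average per census point: {census_mb / census_points:,.0f} MB")
```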
Making Use of Census RDS Data
Ryan Cherland
Examples of how Institutional Research uses the Census RDS
• Creation and storage of a Census single record per student Reporting Datamart table
• Audit checks
  – A source for audit checks with production SQRs run against the SASTATIC SA database
• Reports
  – Census Class Roster
Single Record Report Table
• Contains basic bio-demographic data
• Campus location where majority of credit hours are taken (only location for this)
• Various credit hour and FTE fields covering campus location, special hours (ROTC, Dissertation/Thesis), overall hours, etc.
• “Primary” school, major, and student level selected from all active programs and plans
• Test scores (SAT, ACT, GRE)
• 104 fields in total
Creation of REPORTING_DMART table
• Audit reports and correction efforts are made prior to the Census day in the Live SA database, but some errors are too complicated or new to be fixed in time
• SQR written by SA developers runs against the SASTATIC database and creates a flat-file and error check report
• This table is where corrections to the flat file (wrong student levels, missing program/plan stacks, etc.) are applied before the data is loaded into SARDSCEN using SAS
Screen Shots of Fields in Reporting Datamart…
Example of an Audit Check Report
• Departmental Load Analysis
  – A summarized report by school, department, and course level of credit hour enrollment in courses
  – The OIRP SAS program using the SARDSCEN data allows us to validate the data in SARDSCEN against the SQR version of this report that is run against SASTATIC
Results for the School of Business for the Fall 2003 Semester
Reporting from the SARDSCEN database
• Advantages in having the same data structures
  – Can quickly point a report at either the SARDSCEN database or the daily SARDS database
• Familiarity with the daily SARDS allows other “non-IR” staff to start using the historical data with confidence
Using SAS macro code to “flip” which database is being accessed
/* Uncomment these for daily access */
%LET LIB=SARDS;
%let cen=;
/* Uncomment these for CENSUS access */
%*LET LIB=SARDSCEN;
%*LET cen=CENSUS EQ '2003920' AND;
/* Set Term Code below for desired semester */
%*let trm=4039;
%let trm=4042;
SAS Macro fields in the query…
PROC SQL;
create table enrolled as
select * from &LIB..rec_enrollment_fact_ku
where &CEN ENRL_STRM="&TRM"
and enrl_status_cd='E'
and class_sid in (select distinct class_sid
from &LIB..rec_class_dim
where &CEN class_term_cd="&trm"
and class_subject_cd='TH&F')
order by class_sid;
quit;
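The same "flip" idea can be sketched outside SAS. In the Python version below, the library name and the optional census filter play the roles of the LIB and CEN macro variables; the function name is hypothetical, and the values are treated as trusted constants set by the analyst, not user input.

```python
def enrollment_query(term, census=None):
    """Build the enrollment query for the daily RDS or one census point."""
    # LIB: census archive when a census point is given, daily RDS otherwise.
    lib = "SARDSCEN" if census else "SARDS"
    # CEN: extra filter that is empty for daily access.
    cen = f"CENSUS = '{census}' AND " if census else ""
    return (
        f"SELECT * FROM {lib}.rec_enrollment_fact_ku "
        f"WHERE {cen}ENRL_STRM = '{term}' AND enrl_status_cd = 'E'"
    )

print(enrollment_query("4042"))                    # daily access
print(enrollment_query("4039", census="2003920"))  # census access
```

Because both databases share the same table structures, the only differences between the two generated queries are the library prefix and the added CENSUS condition.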
Output Results using Census
Output Results using Daily
Conclusion
Creating a historical archive version of the RDS from a copy of the daily RDS is a bit labor intensive, but the results are well worth it!!
Ray Helm, [email protected], Research Analyst, University of Kansas
Ryan Cherland, [email protected], Director, University Management Information, University of Kansas - Lawrence
http://www.heug.org
Attendees may download HEUG 2004 presentations from HEUG On-Line