
A way to integrate IR and Academic activities to enhance institutional effectiveness.

Introduction

The University of Alabama (State of Alabama, USA) was one of the first universities to use data mining for retention, and to extend it through the whole cycle to include intervention as well as recruitment. For example, high school students most likely to attend the university and freshmen at risk of dropping out are identified. Of particular interest is that this was a joint effort between data mining students in the Department of Statistics and the Enrollment Office.

This led to the idea of establishing cooperation between the Planning Unit and the Department of Mathematical Statistics and Actuarial Science at the UFS. The Statistics department presents a postgraduate course in Data Mining using SAS Enterprise Miner. Because this course has a practical component that constitutes at least 40% of the final mark, it creates the opportunity to involve the students in Institutional Research activities. A total of twenty projects were identified and one assigned to each of the thirty students enrolled for this course. The data used for these projects are from the student database of the UFS.

The Postgraduate Course

Currently only an introductory data mining course is presented. For optimal results it will be necessary to introduce a further, more advanced data mining course.

Course 1: Introduction to Data Mining.

In this course SAS Enterprise Miner and the use of predictive models are introduced. A broad overview of the modeling techniques of Logistic Regression, Decision Trees, and Neural Networks is provided. The concepts of data partitioning, model assessment using lift charts and ROC curves, and model implementation are presented. The project is part of this course.
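The partitioning and assessment concepts named above can be illustrated outside SAS Enterprise Miner. The following is a minimal sketch in plain Python, not the course's own tooling; the function names, the 70/30 default split, and the example scores are all assumptions made for illustration.

```python
import random

def partition(records, train_frac=0.7, seed=42):
    """Shuffle a copy of the records and split it into training and validation sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]

def lift_at(scores, targets, depth=0.1):
    """Lift in the top `depth` fraction of cases, ranked by predicted score.

    Lift = (success rate among the top-ranked cases) / (overall success rate).
    """
    ranked = sorted(zip(scores, targets), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, int(len(ranked) * depth))
    top_rate = sum(t for _, t in ranked[:n_top]) / n_top
    base_rate = sum(targets) / len(targets)
    return top_rate / base_rate

def roc_auc(scores, targets):
    """Area under the ROC curve: the share of (success, failure) pairs in which
    the success case receives the higher score; ties count as half."""
    pos = [s for s, t in zip(scores, targets) if t == 1]
    neg = [s for s, t in zip(scores, targets) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical model scores and observed outcomes for ten students.
example_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
example_targets = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
```

A lift above 1 in the top-ranked group means the model concentrates successful students there better than random selection would; an ROC area of 0.5 corresponds to random ranking and 1.0 to perfect ranking.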

Course 2: Advanced Data Mining.

This should provide a more in-depth coverage of the technical aspects of each of the modeling tools discussed in the first course. Topics in Statistical Decision Theory and unsupervised learning can also be included.

The dataset and the variables

X1   ID                    Identification number of student
X2   RACE                  Race of student
X3   GENDER                Gender of student
X4   CAMPUSY1              Campus of registration for year one
X5   FACULTYY1             Faculty for year one
X6   MINYEARSTOGRAD        Minimum years to obtain qualification registered for
X7   EXTENDEDPY1           Is the qualification registered for an extended program? (Y, N, NOTAV)
X8   AGEY1                 Age of the student when registering for year one
X9   H_LANGUAGE            The home language of the student
X10  M_COUNT               The M-count obtained by the student
X11  NUMCREDITS1Y1         The total number of credits registered for in the first semester of year one
X12  PROPCREDITSPASSED1Y1  The proportion of credits passed in the first semester of year one
X13  NUMCREDITSY1          The total number of credits registered for in year one
X14  PROPCREDITSPASSEDY1   The proportion of credits passed in year one
Y    TARGET                The binary dependent variable, which can be 1 to indicate success and 0 to indicate failure

The projects

The following projects were identified:

1. Build predictive models using decision trees, regression, and neural networks to identify successful students in Faculty A at the end of the first year of study.

Definitions: Success is defined as the event that a student in Faculty A completes the qualification registered for in year one in the minimum time.

Failure is defined as the event that a student in Faculty A fails to complete the qualification registered for in year one in the minimum time.

Variables included in dataset:

ID, RACE, GENDER, CAMPUSY1, MINYEARSTOGRAD, EXTENDEDPRY1, AGEY1, H_LANGUAGE, M_COUNT, NUMCREDITSY1, PROPCREDITSPASSEDY1, TARGET
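The success and failure definitions for Project 1 amount to a simple rule for deriving TARGET. The sketch below is one possible reading of them, not the presenters' actual derivation; `years_to_complete` is a hypothetical field (it does not appear in the variable list) holding the number of years the student took to complete the qualification, or `None` if it was never completed.

```python
from typing import Optional

def success_target(min_years_to_grad: int, years_to_complete: Optional[int]) -> int:
    """Return 1 (success) if the qualification registered for in year one was
    completed in the minimum time, otherwise 0 (failure)."""
    if years_to_complete is None:
        # Hypothetical convention: None means the qualification was not completed.
        return 0
    return 1 if years_to_complete <= min_years_to_grad else 0
```

For a three-year qualification, `success_target(3, 3)` gives 1, while `success_target(3, 4)` and `success_target(3, None)` both give 0.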

Preliminary results for Project 1: Faculty of Natural and Agricultural Sciences

Faculty of Economic and Management Sciences

Faculty of the Humanities

Faculties can now be compared.

Consider the rule that leads to the highest probability of success in each of the three faculties:

Rule for the Faculty of Natural and Agricultural Sciences:

If PROPCREDITSPASSEDY1 > 0.89 and NUMCREDITSY1 > 142, then P(Success) = 0.41

Rule for the Faculty of Economic and Management Sciences:

If PROPCREDITSPASSEDY1 > 0.81, NUMCREDITSY1 > 116.5, and M_COUNT > 40.5, then P(Success) = 0.59

Rule for the Faculty of the Humanities:

If M_COUNT > 34.5 and PROPCREDITSPASSEDY1 > 0.74, then P(Success) = 0.58
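To make the comparison concrete, the three rules can be written down as data and applied to a student record. This is only an illustrative sketch: the thresholds and probabilities are those quoted in the rules, the credit-count variable is taken to be NUMCREDITSY1 from the variable list, and the dictionary layout, function name, and example student are assumptions.

```python
# Each faculty maps to (conditions, P(Success)); every condition is "variable > threshold".
TOP_RULES = {
    "Natural and Agricultural Sciences": (
        [("PROPCREDITSPASSEDY1", 0.89), ("NUMCREDITSY1", 142)], 0.41),
    "Economic and Management Sciences": (
        [("PROPCREDITSPASSEDY1", 0.81), ("NUMCREDITSY1", 116.5), ("M_COUNT", 40.5)], 0.59),
    "Humanities": (
        [("M_COUNT", 34.5), ("PROPCREDITSPASSEDY1", 0.74)], 0.58),
}

def top_rule_probability(faculty, student):
    """Return P(Success) if the student meets every condition of the faculty's
    top rule, otherwise None (the rule says nothing about this student)."""
    conditions, probability = TOP_RULES[faculty]
    if all(student[name] > threshold for name, threshold in conditions):
        return probability
    return None

# Hypothetical student record, for illustration only.
example_student = {"PROPCREDITSPASSEDY1": 0.85, "NUMCREDITSY1": 120, "M_COUNT": 42}
```

The example student satisfies the Economic and Management Sciences rule and the Humanities rule, but not the Natural and Agricultural Sciences rule, whose threshold on the proportion of credits passed is stricter.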

2. Build predictive models using decision trees, regression, and neural networks to identify students likely to drop out from Faculty A at the end of year one.

Definitions: Dropout is defined as the event that a student, who did not graduate at the end of year one, is not registered at the beginning of year two for any qualification in Faculty A. Only data available at the end of the first semester should be used.

No-Dropout is defined as the event that a student is still registered at the beginning of year two for a qualification in Faculty A (not necessarily the same qualification as in year one).

Variables included in dataset:

ID, RACE, GENDER, CAMPUSY1, MINYEARSTOGRAD, AGEY1, H_LANGUAGE, M_COUNT, NUMCREDITS1Y1, PROPCREDITSPASSED1Y1, TARGET
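The dropout definition and the restriction to first-semester data can be sketched as follows. The two boolean inputs to `dropout_target` are hypothetical fields that would have to come from the registration records, and treating ID as an identifier rather than a model input is an assumption; neither is stated in the slides.

```python
from typing import Optional

# Inputs listed for Project 2, all known by the end of the first semester.
# ID is left out here on the assumption that it only identifies the record.
PROJECT2_INPUTS = [
    "RACE", "GENDER", "CAMPUSY1", "MINYEARSTOGRAD", "AGEY1",
    "H_LANGUAGE", "M_COUNT", "NUMCREDITS1Y1", "PROPCREDITSPASSED1Y1",
]

def dropout_target(graduated_y1: bool, registered_in_faculty_y2: bool) -> Optional[int]:
    """Return 1 for Dropout, 0 for No-Dropout, or None when the definition
    does not apply because the student graduated at the end of year one."""
    if graduated_y1:
        return None
    return 0 if registered_in_faculty_y2 else 1

def model_inputs(record: dict) -> dict:
    """Keep only the permitted inputs, so that full-year variables such as
    PROPCREDITSPASSEDY1 cannot leak into the dropout model."""
    return {name: record[name] for name in PROJECT2_INPUTS}
```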

3. Build predictive models using decision trees, regression, and neural networks to identify students in Faculty A likely to pass more than 90% of the courses registered for in year one. Only data available at the time of registration should be used.

4. Build predictive models using decision trees, regression, and neural networks to identify students in Faculty A likely to pass fewer than 20% of the courses registered for in year one. Only data available at the time of registration should be used.

Variables included in dataset:

ID, RACE, GENDER, CAMPUSY1, MINYEARSTOGRAD, AGEY1, H_LANGUAGE, M_COUNT, TARGET

“Faculty A” can be:

A. Humanities
B. Education
C. Natural and Agricultural Sciences
D. Business and Management Sciences
E. All faculties
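For Projects 3 and 4 the binary targets follow directly from the stated thresholds. A minimal sketch, assuming hypothetical per-student counts of courses passed and courses registered for in year one (the variable list records proportions of credits, not course counts):

```python
def pass_rate_targets(courses_passed: int, courses_registered: int) -> dict:
    """Derive the Project 3 and Project 4 targets from year-one course counts."""
    if courses_registered <= 0:
        raise ValueError("courses_registered must be positive")
    rate = courses_passed / courses_registered
    return {
        "project3": int(rate > 0.90),  # passed more than 90% of courses
        "project4": int(rate < 0.20),  # passed less than 20% of courses
    }
```

Both comparisons are strict, so a student who passes exactly 90% or exactly 20% of their courses is coded 0 for both targets.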

The advantages

1. Students are exposed to real life data sets.

2. Students are introduced to an aspect of Institutional Research and gain insight into the challenges universities face.

3. Recent computing advances have created an increased demand for Business Intelligence (BI) professionals. The courses are designed to educate students to meet the marketplace demand.

4. Cooperation and understanding between support services and academics are promoted.

5. Because several projects run simultaneously, the university is provided with BI on a large scale to support strategic decision-making.

6. Projects can be updated every year to accommodate new enrollments.

7. Possible changes over time in predictive models can be investigated.

8. The projects will enable comparison between faculties. It will, for example, be possible to compare the indicators for a dropout in Faculty A with those in Faculty B.

The challenges

1. Well-equipped computer laboratories should be available for student use.

2. Students with an insufficient level of computer literacy should be prevented from entering the course.

3. A data warehouse should be in place and properly maintained for reliable results.

4. The student projects should be closely monitored and supervised.

Thank you