Identifying Critical Factors in Case-Based Prediction

R. Weber, College of Information Science & Technology, Drexel University

Page 1: Identifying Critical Factors in Case-Based Prediction

Identifying Critical Factors in Case-Based Prediction

R. Weber

College of Information Science & Technology

Drexel University

Page 2: Identifying Critical Factors in Case-Based Prediction

Identifying Critical Factors in Case-Based Prediction, R. Weber, 18-May, FLAIRS 2004

Outline
1. Case-Based Prediction, Critical Factors
2. Motivation
3. Background: Use of Domain Knowledge
4. Methods to Identify Critical Factors
   – Gradient descent, Logistic regression, Feature-oriented
   – Case-based, Knowledge-based, Union
5. Comparative Study
   – Dataset, Methodology, Results
6. Conclusions
7. Future Work

Page 3: Identifying Critical Factors in Case-Based Prediction

Case-Based Prediction
• The predicted outcome can be:
  – Irreversible
    • Path of natural disasters, e.g. hurricanes, tornadoes
  – Reversible
    • Ongoing project outcome, project effort, cost; health conditions
• Critical Factors:
  – features (feature-value pairs) that support the outcome
  – significant changes in their values can potentially reverse the prediction, either alone or in conjunction with changes in the values of other critical factors
  – Critical Success and Critical Failure Factors

Page 4: Identifying Critical Factors in Case-Based Prediction

Motivation
• Assumption:
  – Users are interested in prediction of reversible outcomes so they can reverse unwanted predictions
    • Health conditions, project/system failure
• Aamodt and Nygaard (1995):
  – Consider the entire application context (including the user’s perspective) to maximize the usefulness of CBR systems
• Motivation:
  – Case-based prediction systems that do not indicate effective and efficient ways to reverse unwanted outcomes do not take into account the user’s perspective.
  – Find a minimal set of critical factors that maximizes the chances of reversing unwanted outcomes

Page 5: Identifying Critical Factors in Case-Based Prediction

Background on Case-Based Prediction

• ICCBR 2001: Kadoda et al. stated that design decisions depend on the dataset
• FLAIRS 2002: Watson et al. evaluated different design decisions because of such bias
• CBRW91: Cain, Pazzani, and Silverstein proposed EBL+CBR to improve the accuracy of case-based prediction when features outnumber cases
• ICCBR03: Weber et al. confirmed the improvement in accuracy (scarce data, bias) against other CBR techniques and logistic regression

Page 6: Identifying Critical Factors in Case-Based Prediction

Methods to Identify Critical Factors

Scope
• Personalized
  – Methods that identify failure and success factors that are specific to the case under assessment and to its actual values
• Collective
  – They only identify the features
  – Provide trends based upon a community of cases. When this community consists of real-world experiences, they represent evidence of the importance of these factors

Page 7: Identifying Critical Factors in Case-Based Prediction

Collective Methods
• Gradient descent
  – Critical factors are those features whose resulting importance values are above the overall average.
• Logistic regression
  – Critical factors are those features with the strongest correlations to the outcome; these features are then used for prediction purposes
• Feature-oriented
  – Using LOOCV, submit a project description for prediction and observe the resulting accuracy; then submit each feature separately and take as success factors the features that produce accuracy closest to the overall accuracy of true positives, and as failure factors the ones with accuracy closest to the overall accuracy of true negatives
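The gradient-descent rule above amounts to a threshold on learned feature importance. A minimal sketch in Python, assuming the importance values have already been learned; the function name `above_average_features`, the feature names, and the weights are all hypothetical and not taken from the study:

```python
from statistics import mean

def above_average_features(weights):
    """Return, in sorted order, the features whose importance exceeds the mean importance."""
    threshold = mean(weights.values())
    return sorted(f for f, w in weights.items() if w > threshold)

# Hypothetical importance values, standing in for weights learned by gradient descent.
weights = {
    "well_defined_scope": 0.9,
    "requirements_time": 0.7,
    "team_size": 0.2,
    "project_duration": 0.2,
}

critical = above_average_features(weights)
print(critical)
```

Because the threshold is the overall average, the rule can keep many features when the weights are close together, so it gives no guarantee of a small factor set.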

Page 8: Identifying Critical Factors in Case-Based Prediction

Personalized Methods
• Case-based
  – Failure factors are feature-value pairs that co-occur in both the target case and the similar case(s) used to predict failure in the target
• Knowledge-based
  – Submit the new case to the EBL method to identify relevance factors with the resulting prediction
  – In predictions of failure, the feature-values assigned relevance factors are critical failure factors
  – For the remaining features, we replaced the predicted outcome to assign relevance factors for the alternate outcome
• Union
  – We combined the knowledge-based and the case-based methods by taking the union of the factors each individually identifies.
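The case-based and union methods can be illustrated with plain set operations. This is a sketch under stated assumptions: cases are dictionaries of symbolic feature-value pairs, a pair counts as co-occurring only if it appears in every retrieved similar case (the slides do not say whether "all" or "any" is intended), and the knowledge-based factors are given by hand rather than produced by an EBL method. All names and values are hypothetical.

```python
def case_based_factors(target, similar_cases):
    """Feature-value pairs shared by the target and every retrieved similar case."""
    shared = set(target.items())
    for case in similar_cases:
        shared &= set(case.items())
    return shared

# Hypothetical target case predicted to fail, and the similar case used for that prediction.
target = {"scope": "poorly_defined", "requirements_time": "insufficient", "schedule": "unrealistic"}
similar = [
    {"scope": "poorly_defined", "requirements_time": "insufficient", "schedule": "realistic"},
]

case_based = case_based_factors(target, similar)

# Hypothetical stand-in for the failure factors an EBL method would return.
knowledge_based = {("scope", "poorly_defined"), ("schedule", "unrealistic")}

# Union method: every factor identified by either approach.
union = case_based | knowledge_based
print(sorted(union))
```

The union can only grow the factor set, so it trades a larger set of factors for a better chance that the set contains what is needed to reverse the prediction.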

Page 9: Identifying Critical Factors in Case-Based Prediction

Comparative Study: Dataset
• Dataset
  – 20 out of 88 real cases of software development projects
  – 23 symbolic features
  – 12 out of the 21 projects originally failed and, when submitted to the EBL+CBR prediction, were predicted to fail.

Page 10: Identifying Critical Factors in Case-Based Prediction

Comparative Study: Methodology
• Methodology consists of 3 stages:
  – 1) Identification of critical factors
  – 2) Overturn
  – 3) Prediction
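The three stages can be sketched end to end as follows. The study uses EBL+CBR for prediction; this sketch swaps in a plain nearest-neighbor predictor over symbolic features purely for illustration, and the case base, feature names, and the hand-picked factors in stage 1 are hypothetical.

```python
def similarity(a, b):
    """Count the features on which two symbolic cases agree."""
    return sum(a[f] == b[f] for f in a)

def predict(case, case_base):
    """Return the outcome of the most similar stored case (1-nearest neighbor)."""
    _, outcome = max(case_base, key=lambda stored: similarity(case, stored[0]))
    return outcome

def overturn(case, new_values):
    """Return a copy of the case with the critical factors set to their alternate values."""
    changed = dict(case)
    changed.update(new_values)
    return changed

# Hypothetical case base of (features, outcome) pairs.
case_base = [
    ({"scope": "poor", "req_time": "no", "team": "small"}, "failure"),
    ({"scope": "good", "req_time": "yes", "team": "small"}, "success"),
    ({"scope": "good", "req_time": "yes", "team": "large"}, "success"),
]
target = {"scope": "poor", "req_time": "no", "team": "large"}

before = predict(target, case_base)

# Stage 1: identification of critical factors (given by hand here).
critical = {"scope": "good", "req_time": "yes"}
# Stage 2: overturn the values of those factors in the target case.
revised = overturn(target, critical)
# Stage 3: predict again and check whether the outcome was reversed.
after = predict(revised, case_base)

print(before, "->", after)
```

A method's factor set counts as effective for a project when `after` differs from `before`; the size of `critical` is what a minimal set tries to keep small.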

Page 11: Identifying Critical Factors in Case-Based Prediction

Results for Collective Methods

• GD maximizes reversal but does not minimize the set of factors
• Feature-oriented is the most efficient
• Methods currently used performed most poorly

Page 12: Identifying Critical Factors in Case-Based Prediction

Results for Personalized Methods

Results for Knowledge-Based Overturning

Page 13: Identifying Critical Factors in Case-Based Prediction

Knowledge-Based Overturning
• Personalized
  – Different methods are able to reverse a project’s prediction using different sets of factors, and one method reversed a prediction contrary to domain knowledge.
• Collective
  – GD failed to reverse one project. However, when we performed knowledge-based overturning, we found that it still could not reverse that one project. More interestingly, some projects are no longer reversed.

Page 14: Identifying Critical Factors in Case-Based Prediction

Conclusions and Recommendations

• Domain-specific conclusion
  – 2 factors were identified by all of the methods
    • a well-defined scope
    • end users having time for requirements gathering
• Domain knowledge combined with contextual experiential knowledge may uncover knowledge
• Define the level of reversibility of factors, e.g., using measures of efficiency of factors throughout the dataset and by project. Factors that are easy to reverse should receive priority.

Page 15: Identifying Critical Factors in Case-Based Prediction

Future Work
• Case-based framework to learn:
  – Weights for EBL rules
  – Dependencies between rules
  – Dependencies between factors
• How to use contextual knowledge embedded in cases to reverse unwanted outcomes?
  – Use collective methods to identify critical factors and then use cases to assess their potential to reverse unwanted outcomes

Page 16: Identifying Critical Factors in Case-Based Prediction

Acknowledgements
• Co-authors
  – William Evanco, Michael Waller, June Verner
• Colleagues
  – This and previous work
• Anonymous reviewers
• National Institute for Systems Test and Productivity

Page 17: Identifying Critical Factors in Case-Based Prediction

Questions? Ideas? Comments?