Idea


Transcript of Idea

Page 1: Idea

Learning to Change Projects

Raymond Borges, Tim Menzies

Lane Department of Computer Science & Electrical Engineering, West Virginia University

PROMISE’12: Lund, Sweden, Sept 21, 2012


Page 2: Idea

Sound bites

Less prediction, more decision

Data has shape

“Data mining” = “carving” out that shape

To reveal shape, remove irrelevancies

Cut the cr*p. Use reduction operators: dimension, column, row, rule

Show, don’t code

Once you can see the shape, inference is superfluous. Implications for other research.


Pages 3-5: Idea

Decisions, Decisions...

Tom Zimmermann:

“We forget that the original motivation for predictive modeling was making decisions about software projects.”

ICSE 2012 Panel on Software Analytics

“Prediction is all well and good, but what about decision making?”

Predictive models are useful

They focus an inquiry onto particular issues

but predictions are sub-routines of decision processes


Pages 6-14: Idea

Q: How to Build Decision Systems?

1996: T. Menzies, Applications of abduction: knowledge-level modeling, International Journal of Human-Computer Studies

Score contexts e.g. Hate, Love; count frequencies of ranges in each:

Diagnosis = what went wrong. δ = Hate(now) − Love(past)

Monitor = what not to do. δ = Hate(next) − Love(now)

Planning = what to do next. δ = Love(next) − Hate(now)

δ = X − Y = contrast set = things frequent in X but rare in Y

TAR3 (2003), WHICH (2010), etc.

But for PROMISE effort estimation data

Contrast sets are obvious...

... Once you find the underlying shape of the data.
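A minimal sketch of the δ = X − Y contrast-set idea in Python. The data layout, thresholds, and function name here are illustrative assumptions, not the paper's TAR3/WHICH implementations: given two sets of projects scored into contexts, return the attribute ranges frequent in X but rare in Y.

from collections import Counter

def contrast_set(x_rows, y_rows, min_x=0.5, max_y=0.2):
    # Each row is a dict of {attribute: discretized range}.
    # delta = X - Y: ranges frequent in X (>= min_x) but rare in Y (<= max_y).
    def frequencies(rows):
        n = float(len(rows))
        counts = Counter(pair for row in rows for pair in row.items())
        return {pair: count / n for pair, count in counts.items()}
    fx, fy = frequencies(x_rows), frequencies(y_rows)
    return [pair for pair, f in fx.items()
            if f >= min_x and fy.get(pair, 0.0) <= max_y]

# e.g. diagnosis = contrast_set(hate_now, love_past)
#      planning  = contrast_set(love_next, hate_now)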


Page 15: Idea

Q: How to find the underlying shape of the data?

Data mining = data carving

To find the signal in the noise...

Timm’s algorithm

1 Find some cr*p

2 Throw it away

3 Go to 1


Pages 16-18: Idea

IDEA = Iterative Dichotomization on Every Attribute

Timm’s algorithm

1 Find some cr*p

2 Throw it away

3 Go to 1

1 Dimensionality reduction

2 Column reduction

3 Row reduction

4 Rule reduction

And in the reduced data, inference is obvious.


Pages 19-23: Idea

IDEA = Iterative Dichotomization on Every Attribute

1 Dimensionality reduction (recursive fast PCA)

Fastmap (Faloutsos’94)

W = anything

X = furthest from W

Y = furthest from X

Takes time O(2N)

Let c = dist(X,Y)

If Z has distances a, b to X, Y, then Z projects to x = (a² + c² − b²) / (2c)

Platt’05: Fastmap = Nystrom algorithm = fast & approximate PCA
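A sketch of that projection step, assuming a caller-supplied distance function and distinct pivots (c > 0); IDEA's pivot selection and recursion details may differ.

import random

def fastmap_1d(rows, dist):
    # Pick two distant pivots with O(2N) distance calls:
    # W = anything, X = furthest from W, Y = furthest from X.
    w = random.choice(rows)
    x = max(rows, key=lambda r: dist(r, w))
    y = max(rows, key=lambda r: dist(r, x))
    c = dist(x, y)
    # Cosine rule: Z with distances a, b to X, Y projects to (a² + c² − b²)/(2c).
    def project(z):
        a, b = dist(z, x), dist(z, y)
        return (a * a + c * c - b * b) / (2 * c)
    return [project(z) for z in rows]

Recursing on each half of the projected axis gives the "recursive fast PCA" clustering used in the later steps.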


Pages 24-29: Idea

IDEA = Iterative Dichotomization on Every Attribute

1 Dimensionality reduction (recursive fast PCA)

2 Column reduction (info gain)

Sort columns by their diversity

Keep columns that select for fewest clusters

e.g. nine rows in two clusters

cluster c1 has acap=2,3,3,3,3; pcap=3,3,4,5,5

cluster c2 has acap=2,2,2,3; pcap=3,4,4,5

p(acap = 2) = 0.44   p(acap = 3) = 0.55
p(pcap = 3) = p(pcap = 4) = p(pcap = 5) = 0.33

p(acap = 2|c1) = 0.25   p(acap = 2|c2) = 0.75
p(acap = 3|c1) = 0.80   p(acap = 3|c2) = 0.20
p(pcap = 3|c1) = 0.67   p(pcap = 3|c2) = 0.33
p(pcap = 4|c1) = 0.33   p(pcap = 4|c2) = 0.67
p(pcap = 5|c1) = 0.67   p(pcap = 5|c2) = 0.33

I(col) = ∑_x p(x) · ( ∑_c −p(x|c) · log p(x|c) )

I(acap) = 0.239 ← keep
I(pcap) = 0.273 ← prune
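A sketch of that diversity score on the slide's worked example. Assumptions: log base 10 and the conditional cluster frequencies shown above; the exact rounding may differ from the slide's 0.239/0.273, but the ranking, and hence the keep/prune decision, comes out the same.

import math
from collections import Counter

def diversity(values_by_cluster):
    # I(col): expected entropy of the cluster label given a column's value.
    # Lower diversity = the column's ranges select for fewer clusters = keep.
    n = sum(len(vals) for vals in values_by_cluster.values())
    value_counts, cluster_counts = Counter(), {}
    for cluster, vals in values_by_cluster.items():
        for v in vals:
            value_counts[v] += 1
            cluster_counts.setdefault(v, Counter())[cluster] += 1
    score = 0.0
    for v, nv in value_counts.items():
        p_v = nv / n
        entropy = -sum((k / nv) * math.log10(k / nv)
                       for k in cluster_counts[v].values())
        score += p_v * entropy
    return score

acap = {"c1": [2, 3, 3, 3, 3], "c2": [2, 2, 2, 3]}
pcap = {"c1": [3, 3, 4, 5, 5], "c2": [3, 4, 4, 5]}
print(diversity(acap), diversity(pcap))   # acap scores lower, so acap is kept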


Pages 30-32: Idea

IDEA = Iterative Dichotomization on Every Attribute

1 Dimensionality reduction (recursive fast PCA)

2 Column reduction (info gain)

3 Row reduction (replace clusters with their mean)

Replace all leaf cluster instances with their centroid

Centroids are described using only the columns within 50% of the minimum diversity.

e.g. Nasa93 reduces to 12 columns and 13 centroids.
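A minimal sketch of that row-reduction step, assuming purely numeric columns (the paper's handling of symbolic COCOMO ranges may differ, e.g. via modes or medians):

def centroid(rows):
    # One representative row per leaf cluster: the column-wise mean.
    return [sum(col) / len(col) for col in zip(*rows)]

leaf = [[2, 3, 1.1],
        [3, 5, 0.9],
        [3, 4, 1.0]]
print(centroid(leaf))   # ≈ [2.67, 4.0, 1.0]: one centroid replaces three rows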


Page 33: Idea

Nasa93 reduces to 12 columns and 13 centroids


Pages 34-35: Idea

IDEA = Iterative Dichotomization on Every Attribute

1 Dimensionality reduction (recursive fast PCA)

2 Column reduction (info gain)

3 Row reduction (replace clusters with their mean)

4 Rule reduction (contrast home vs neighbors)

Surprise: after steps 1,2,3...

Further computation is superfluous.

Visuals sufficient for contrast set generation


Page 36: Idea

Manual Construction of Contrast Sets

Table 5 = your “home” cluster

Table 6 = projects of similar size

Table 7 = a nearby project with fearsome effort

Contrast set = the delta on the last line


Page 37: Idea

Why Cluster120?

Is it valid that cluster120 costs so much?

Yes, if building core services with cost amortized over N future apps.

No, if racing to get products to a competitive market

We do not know, but at least we are focused on that issue.


Pages 38-39: Idea

Reductions on PROMISE data sets

[Figure: size of the reduced data sets, plotting columns (0-25) against rows (1-100, log scale)]

reduced data set     rows   columns
Albrecht                4         4
China                  66        15
Cocomo81                8        18
Cocomo81e               4        16
Cocomo81o               4        16
Cocomo81s               2        16
Desharnais              8        19
Desharnais L1           6        10
Desharnais L2           4        10
Desharnais L3           2        10
Finnish                 6         2
Kemerer                 2         7
Miyazaki’94             6         3
Nasa93                 13        12
Nasa93 center 5         7        16
Nasa93 center1          2        15
Nasa93 center2          5        16
SDR                     4        21
Telcom1                 2         1

Q: throwing away too much?


Pages 40-42: Idea

Q: Throwing Away Too Much?

Estimates = class variable of nearest centroid in reduced space

Compare to the 90 pre-processor*learner combinations from Kocagueneli et al., TSE 2011, On the Value of Ensemble Learning in Effort Estimation.

Performance measure = MRE = (pred − actual) / actual

9 pre-processors:

1 norm: normalize numerics to 0..1, min..max
2 log: replace numerics of the non-class columns with their logarithms
3 PCA: replace non-class columns with principal components
4 SWReg: cull uninformative columns with stepwise regression
5 Width3bin: divide numerics into 3 bins with boundaries (max-min)/3
6 Width5bin: divide numerics into 5 bins with boundaries (max-min)/5
7 Freq3bins: split numerics into 3 equal-size percentiles
8 Freq5bins: split numerics into 5 equal-size percentiles
9 None: no pre-processor

10 learners:

1 1NN: simple one-nearest-neighbor
2 ABE0-1nn: analogy-based estimation using the nearest neighbor
3 ABE0-5nn: analogy-based estimation using the median of the five nearest neighbors
4 CART(yes): regression trees, with sub-tree post-pruning
5 CART(no): regression trees, no post-pruning
6 NNet: two-layered neural net
7 LReg: linear regression
8 PLSR: partial least squares regression
9 PCR: principal components regression
10 SWReg: stepwise regression
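Returning to the estimate defined above, a sketch of how such a prediction and its MRE might be computed. The centroid record layout and names here are illustrative assumptions, and the MRE code uses the usual absolute value.

def estimate(project, centroids, dist):
    # Prediction = effort value stored with the nearest centroid
    # in the reduced space.
    nearest = min(centroids, key=lambda c: dist(project, c["features"]))
    return nearest["effort"]

def mre(predicted, actual):
    # Magnitude of relative error.
    return abs(predicted - actual) / actual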


Pages 43-44: Idea

Results

Perennial problem with assessing different effort estimation tools: MRE is not normally distributed; it has a low valley and high hills (which injects much variance)

IDEA’s predictions are not better or worse than the others, but they avoid all the hills


Pages 45-46: Idea

Related Work

Cluster using (a) centrality (e.g. k-means); (b) connectedness (e.g. dbScan); (c) separation (e.g. IDEA)

Who                          case-based   clustering                  feature selection   task
Shepperd (1997)              √                                                            predict
Boley (1998)                              recursive PCA                                   predict
Bettenburg et al. (MSR’12)                recursive regression                            predict
Posnett et al. (ASE’11)                   on file/package divisions                       predict
Menzies et al. (ASE’11)      √            FastMap                                         contrast
IDEA                         √            √                           √                   contrast


Page 47: Idea

Back to the Sound bites

Less prediction, more decision

Data has shape

“Data mining” = “carving” out that shape

To reveal shape, remove irrelevancies

Cut the cr*p. IDEA = reduction operators: dimension, column, row, rule

Show, don’t code

Once you can see the shape, inference is superfluous. Implications for other research.


Page 48: Idea

Questions? Comments?
