The Effects of Duration-based Moving Windows with Estimation by Analogy - Sousuke Amasaki


The Effects of Duration-based Moving Windows with Estimation by Analogy

Sousuke Amasaki* Chris Lokan†

Okayama Prefectural University* UNSW Canberra†

Mensura 2015 in Krakow, Poland

In Mensura 2012, we focused on Moving Windows for Effort Estimation with Estimation by Analogy

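[Diagram: a timeline from past to future. Project data inside the window is retained to train the EbA effort estimation model for the target project to be estimated; older projects are dropped because they may be useless. The window, of a given window size, moves forward when a new target arrives.]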

Conclusion in Mensura 2012 paper
• MW could improve accuracy with EbA
• Weaker effects with EbA than with Linear Regression

Window policies matter

• MW was examined with LR and two policies [IST2014]*
  – Fixed-size: retain N projects in a window
  – Fixed-duration: retain projects within N months
• Results show a difference in accuracy improvement between the policies

* C. Lokan, E. Mendes. Investigating the use of duration-based moving windows to improve software effort prediction: A replicated study, Information and Software Technology 56(9), pp. 1063–1075, 2014.
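The two window policies, and the growing portfolio they are compared against, can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the record layout (name, completion date), the function names, and the month arithmetic (whole calendar months, inclusive bound) are all assumptions.

```python
from datetime import date


def months_between(earlier, later):
    """Whole calendar months from `earlier` to `later` (assumed convention)."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)


def completed_before(history, target_start):
    """Projects finished by the time the target starts, oldest first."""
    past = (p for p in history if p[1] <= target_start)
    return sorted(past, key=lambda p: p[1])


def growing_portfolio(history, target_start):
    """Baseline: use all past projects."""
    return completed_before(history, target_start)


def fixed_size_window(history, target_start, n):
    """Fixed-size policy: retain the N most recently completed projects."""
    past = completed_before(history, target_start)
    return past[-n:] if n > 0 else []


def fixed_duration_window(history, target_start, n_months):
    """Fixed-duration policy: retain projects completed within N months."""
    return [p for p in completed_before(history, target_start)
            if months_between(p[1], target_start) <= n_months]


if __name__ == "__main__":
    # Hypothetical records: (name, completion date).
    projects = [("A", date(2010, 1, 15)), ("B", date(2011, 6, 1)),
                ("C", date(2012, 3, 10)), ("D", date(2013, 1, 5))]
    start = date(2013, 2, 1)
    print(fixed_size_window(projects, start, 2))
    print(fixed_duration_window(projects, start, 24))
```

With a fixed-size window the number of training projects is constant while the time span varies; with a fixed-duration window the time span is constant while the number of training projects varies.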

Today's talk is about Duration-based Moving Windows

[Diagram: a timeline from past to future contrasting fixed-size windows (Mensura 2012) with fixed-duration windows, and EbA with pre-selected features (Mensura 2012) with EbA with on-time feature selection (for realism).]

Research Questions

RQ1. Reconfirmation of Mensura 2012 results

Is there a difference in the accuracy of estimates between EbA with pre-selected features and EbA with on-time feature selection, using fixed-size windows?

RQ2. Evaluation of Fixed-Duration Windows

Is there a difference in the accuracy of estimates with and without MW when using the revised EbA and fixed-duration windows?

RQ3. Comparison between window policies

How do these results compare with results based on fixed-size windows?

The revised EbA

Mensura 2012
• Select features on the basis of the whole dataset
  – Wrapper approach
• Use simple mean for estimation

Unrealistic to use future projects

This study
• Select features for every new target project
  – Lasso for reducing computation costs
• Use inverse rank weighted mean (IRWM) for estimation

Contribute to estimation accuracy
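The estimation step of the revised EbA can be sketched as below. This is an illustrative sketch under stated assumptions, not the authors' code: the distance measure (range-scaled numeric gaps plus 0/1 categorical mismatches), the feature names, and all function names are hypothetical, and the Lasso-based feature selection is omitted. The IRWM weighting shown is the commonly used form in which, with k analogues, the closest gets weight k and the farthest gets weight 1.

```python
import math


def distance(target, candidate, numeric_ranges):
    """Assumed distance: scaled numeric gaps plus 0/1 categorical mismatches."""
    total = 0.0
    for key, value in target.items():
        if key in numeric_ranges:
            low, high = numeric_ranges[key]
            span = (high - low) or 1.0  # avoid division by zero
            total += ((value - candidate[key]) / span) ** 2
        else:
            total += 0.0 if value == candidate[key] else 1.0
    return math.sqrt(total)


def irwm(efforts_nearest_first):
    """Inverse rank weighted mean over efforts ordered from nearest analogue."""
    k = len(efforts_nearest_first)
    if k == 0:
        raise ValueError("need at least one analogue")
    weights = range(k, 0, -1)  # k, k-1, ..., 1
    weighted = sum(w * e for w, e in zip(weights, efforts_nearest_first))
    return weighted / sum(weights)


def estimate_by_analogy(target, training, k=5):
    """training: list of (features_dict, actual_effort) pairs in the window."""
    if not training:
        raise ValueError("training set is empty")
    numeric_ranges = {}
    for key, value in target.items():
        if isinstance(value, (int, float)):
            observed = [features[key] for features, _ in training]
            numeric_ranges[key] = (min(observed), max(observed))
    ranked = sorted(training,
                    key=lambda item: distance(target, item[0], numeric_ranges))
    return irwm([effort for _, effort in ranked[:k]])
```

Compared with a simple mean, IRWM lets the most similar past project dominate the estimate while still using the other k - 1 analogues.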

Dataset

Same as in Mensura 2012

Properties
• High quality: rated A or B by ISBSG
• Size measured with IFPUG 4.0 or later
• Known actual effort
• Not web projects
• 228 projects

Candidate predictors
• Unadjusted FP
• Language types
• Development types
• Platform types
• Domain sector types

Experiments

Comparisons between:
• Mensura 2012 EbA vs. the revised EbA (for RQ1)
• Growing Portfolio (use all past projects) vs. Moving Windows (for RQ2, RQ3)

Assessed for each comparison: preference and statistical significance

Performance trend analysis
• From 12 to 84 months (fixed-duration)
• From 20 to 120 projects (fixed-size)
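The accuracy comparison between the growing portfolio (GP) and moving windows (MW) can be sketched as follows. The function names are hypothetical, and expressing the difference as a percentage of the GP error is an assumption; the sign convention (MW minus GP, so values above zero favor the growing portfolio) follows the figure descriptions in this talk. Significance testing is not shown.

```python
def mean(values):
    """Arithmetic mean of a non-empty iterable."""
    values = list(values)
    if not values:
        raise ValueError("need at least one value")
    return sum(values) / len(values)


def mean_absolute_error(actuals, estimates):
    """MAE: mean of |actual - estimate|."""
    return mean(abs(a - e) for a, e in zip(actuals, estimates))


def mean_mre(actuals, estimates):
    """Mean magnitude of relative error: mean of |actual - estimate| / actual."""
    return mean(abs(a - e) / a for a, e in zip(actuals, estimates))


def percent_difference(mw_error, gp_error):
    """MW error minus GP error, as a percentage of the GP error (assumed)."""
    return 100.0 * (mw_error - gp_error) / gp_error


def compare(actuals, gp_estimates, mw_estimates):
    """Negative values favor moving windows; positive favor the growing portfolio."""
    return {
        "mae_diff_pct": percent_difference(
            mean_absolute_error(actuals, mw_estimates),
            mean_absolute_error(actuals, gp_estimates)),
        "mre_diff_pct": percent_difference(
            mean_mre(actuals, mw_estimates),
            mean_mre(actuals, gp_estimates)),
    }
```

Running `compare` once per window size (12 to 84 months, or 20 to 120 projects) would give the kind of trend line plotted against window size.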

Results: fixed-size windows with the revised EbA

[Fig. 1: Results with fixed-size windows, modified EbA with k = 5. Panels: (a) differences in mean MAE, (b) differences in mean MRE. X-axis: window size (number of projects), 20 to 120. Y-axis: differences in mean AE (%) in (a) and in mean MRE (%) in (b).]

Figure 1 and Table 2 revealed characteristics of moving windows compared to the growing portfolio:

– With windows of up to 60 projects, MAE showed no significant preference for any approach. The line starts below zero and quickly goes above zero (favoring the growing portfolio), but the difference was not significant as shown in Fig. 1(a). MRE showed a similar trend, except that moving windows were …

• GP was advantageous at smaller window sizes, but not significantly so
• MW became significantly advantageous at medium window sizes

Number of neighbors: k = 5

Results: comparisons between the old and the revised EbA

[Fig. 1: Results with fixed-size windows, modified EbA with k = 5. Panels: (a) differences in mean MAE, (b) differences in mean MRE, each plotted against window size (number of projects).]

Number of neighbors: k = 5

• Trends were the same, but effective sizes and ranges were different
• The best k moved from k = 2 (Mensura 2012) to k = 5
• The improvement by MW was clearer in terms of statistical significance

Results: fixed-duration windows with the revised EbA

[Fig. 2: Results with fixed-duration windows, EbA with k = 5. Panels: (a) differences in mean MAE, (b) differences in mean MRE. X-axis: window size (calendar months), 20 to 80. Y-axis: differences in mean AE (%) in (a) and in mean MRE (%) in (b).]

… growing portfolio are larger with EbA than with LR, and the range of durations for which windows are advantageous is narrower with EbA than with LR. The difference in advantageous window sizes and their number between EbA and LR were reported in [4]. These observations were common between this study and [4].

• GP was advantageous at smaller window sizes, but not significantly so
• MW became significantly advantageous at medium window sizes
• Fewer window sizes were significant than with fixed-size windows

Number of neighbors: k = 5

Results: comparison to the past study [IST2014]

[Fig. 2: Results with fixed-duration windows, EbA with k = 5. Panels: (a) differences in mean MAE, (b) differences in mean MRE, each plotted against window size (calendar months).]

Number of neighbors: k = 5

• The overall trend was the same in the two studies
• Fixed-size windows were more effective than fixed-duration windows
• The effective window size became larger and its range narrower

Answers to RQs

RQ1. Reconfirmation of Mensura 2012 results

The change in estimation method made a difference, improving the accuracy of estimates.

RQ2. Evaluation of Fixed-Duration Windows

Fixed-duration windows can make a difference and are effective in improving estimation accuracy.

RQ3. Comparison between window policies

Both the fixed-size and fixed-duration window policies can lead to significantly better estimation accuracy, but the fixed-size policy made a clearer difference.

Practical implications

1. Moving Windows is effective

This and past studies showed its effectiveness with the major effort estimation methods, LR and EbA.

2. Fixed-size policy looks better for estimation

This and past studies showed a clearer difference when using fixed-size windows. Practitioners may want to rethink how they choose reference projects.

3. Effective window sizes might be different even among practitioners

EbA resembles practitioners' thinking. The fact that different EbA options resulted in different effective window ranges partly explains the differences among practitioners.

Threats to Validity

Dataset
• The results are based only on the ISBSG dataset
  – It is difficult to generalize the results

EbA
• Limited to specific options
  – Other settings may be more accurate or more realistic

Conclusion

• Fixed-duration windows work with EbA
  – Under a more realistic situation
• The results brought some practical implications
  – e.g., the fixed-size policy is more suitable

Future Work

• Exploration of EbA options
• Additional experiments on other datasets

We welcome questions!

Contact info:
Sousuke Amasaki: [email protected]-pu.ac.jp
Chris Lokan: [email protected]