Characteristics of Class Collaboration Networks in Large Java Software Projects

Characteristics of Class Collaboration Networks in Large Java Software Projects Miloš Savić, Mirjana Ivanović, Miloš Radovanović Department of Mathematics and Informatics Faculty of Science University of Novi Sad


Page 1: Characteristics of Class Collaboration Networks in Large Java Software Projects

Characteristics of Class Collaboration Networks in Large Java Software Projects

Miloš Savić, Mirjana Ivanović, Miloš Radovanović

Department of Mathematics and Informatics, Faculty of Science

University of Novi Sad

Page 2: Characteristics of Class Collaboration Networks in Large Java Software Projects

Content

• Class collaboration networks
• Characteristics of complex networks
• Mathematical models of complex networks
• Network extraction
• Experiments and results
• Conclusion

Page 4: Characteristics of Class Collaboration Networks in Large Java Software Projects

Class Collaboration Networks - Definition -

• Software – complex, modular, interacting system

• Java Class Collaboration Networks:
  * nodes – classes/interfaces
  * links – interactions among classes/interfaces

• Interaction ↔ Reference
  * Class A instantiates and/or uses objects of class B
  * Class A extends class B
  * Class A implements interface B

Page 5: Characteristics of Class Collaboration Networks in Large Java Software Projects

Class Collaboration Networks - Example -

interface A { … }

class B implements A { … }

class C {
    public void methodC(B b) {
        b.someMethod();
    }
}

class D extends C implements A {
    public B makeB() { return new B(); }
}

[Figure: class collaboration network of the code above, with nodes A, B, C, D]
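The links in this example can be read off the code using the three reference rules from the definition slide. The sketch below lists them explicitly; the `ExampleCcn` and `Link` names are hypothetical and only serve the illustration, and each link points from the referencing class to the referenced one.

```java
import java.util.List;

final class ExampleCcn {
    /** A directed link: {@code from} references {@code to}. */
    record Link(String from, String to, String reason) {}

    /** Links implied by the example classes under the three reference rules. */
    static List<Link> links() {
        return List.of(
            new Link("B", "A", "implements"),    // class B implements A
            new Link("C", "B", "uses"),          // methodC(B b) calls b.someMethod()
            new Link("D", "C", "extends"),       // class D extends C
            new Link("D", "A", "implements"),    // class D ... implements A
            new Link("D", "B", "instantiates")); // makeB() returns new B()
    }

    public static void main(String[] args) {
        for (Link l : links()) {
            System.out.println(l.from() + " -> " + l.to() + " (" + l.reason() + ")");
        }
    }
}
```

Under this reading, A and B each receive two incoming links, C receives one, and D receives none.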

Page 7: Characteristics of Class Collaboration Networks in Large Java Software Projects

Characteristics of complex networks - Degree distribution -

• Node degree: number of links for the node

• Distribution function P(k)
  * probability that a randomly selected node has exactly k links

• Directed graph: incoming and outgoing degree distributions

[Figure: example graph with nodes A, B, C, D, E]
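As a minimal sketch of how P(k) can be computed for a directed graph, the code below counts incoming and outgoing links per node and turns the counts into fractions. The class name, method names, and the integer node encoding are assumptions made for illustration; the edge list in `main` is the four-class example from the earlier slide with A=0, B=1, C=2, D=3.

```java
import java.util.Map;
import java.util.TreeMap;

final class DegreeDistribution {
    /** In-degree of each node; an edge {a, b} means a references b. */
    static int[] inDegrees(int n, int[][] edges) {
        int[] deg = new int[n];
        for (int[] e : edges) deg[e[1]]++;
        return deg;
    }

    /** Out-degree of each node; an edge {a, b} means a references b. */
    static int[] outDegrees(int n, int[][] edges) {
        int[] deg = new int[n];
        for (int[] e : edges) deg[e[0]]++;
        return deg;
    }

    /** P(k): fraction of nodes whose degree is exactly k. */
    static Map<Integer, Double> distribution(int[] degrees) {
        Map<Integer, Double> p = new TreeMap<>();
        for (int k : degrees) p.merge(k, 1.0 / degrees.length, Double::sum);
        return p;
    }

    public static void main(String[] args) {
        int[][] edges = {{1, 0}, {2, 1}, {3, 2}, {3, 0}, {3, 1}};
        System.out.println("P_in  = " + distribution(inDegrees(4, edges)));
        System.out.println("P_out = " + distribution(outDegrees(4, edges)));
    }
}
```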

Page 8: Characteristics of Class Collaboration Networks in Large Java Software Projects

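Characteristics of complex networks - Small world property -

• Relatively short path between any two nodes

• L ~ ln(N) – small-world phenomenon

• L ~ lnln(N) – ultra-small-world phenomenon

• Average path length, where l_ij is the length of the shortest path between nodes i and j and n is the number of nodes:

$l_i = \frac{1}{n-1} \sum_{j=1, j \neq i}^{n} l_{ij}$

$L = \frac{1}{n} \sum_{i=1}^{n} l_i$

[Figure: example network with nodes 1 to 7]

• l_15 = 2, path [1, 2, 5]

• l_17 = 4, path [1, 3, 4, 6, 7]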
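The average path length L can be sketched with one breadth-first search per node. The code below assumes an undirected, connected, unweighted graph with at least two nodes; how the original study treated link direction and unreachable pairs is not stated in the slides, so that choice is an assumption here, as are all class and method names.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class AveragePathLength {
    /** Builds an undirected adjacency list from an edge list. */
    static List<List<Integer>> undirected(int n, int[][] edges) {
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) adj.add(new ArrayList<>());
        for (int[] e : edges) {
            adj.get(e[0]).add(e[1]);
            adj.get(e[1]).add(e[0]);
        }
        return adj;
    }

    /** Shortest-path lengths l_ij from node i to every node j (-1 if unreachable). */
    static int[] distancesFrom(int i, List<List<Integer>> adj) {
        int[] dist = new int[adj.size()];
        Arrays.fill(dist, -1);
        dist[i] = 0;
        ArrayDeque<Integer> queue = new ArrayDeque<>();
        queue.add(i);
        while (!queue.isEmpty()) {
            int u = queue.remove();
            for (int v : adj.get(u)) {
                if (dist[v] < 0) {
                    dist[v] = dist[u] + 1;
                    queue.add(v);
                }
            }
        }
        return dist;
    }

    /** L = (1/n) * sum_i l_i, with l_i = (1/(n-1)) * sum_{j != i} l_ij. */
    static double averagePathLength(List<List<Integer>> adj) {
        int n = adj.size();
        double sumOfLi = 0.0;
        for (int i = 0; i < n; i++) {
            long total = 0;
            for (int d : distancesFrom(i, adj)) {
                if (d < 0) throw new IllegalArgumentException("graph is not connected");
                total += d; // the i == j term contributes 0
            }
            sumOfLi += (double) total / (n - 1);
        }
        return sumOfLi / n;
    }

    public static void main(String[] args) {
        // Hypothetical three-node path 0 - 1 - 2.
        List<List<Integer>> adj = undirected(3, new int[][] {{0, 1}, {1, 2}});
        System.out.println("L = " + averagePathLength(adj));
    }
}
```

For the three-node path, l_0 = l_2 = 1.5 and l_1 = 1, so L = 4/3.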

Page 9: Characteristics of Class Collaboration Networks in Large Java Software Projects

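Characteristics of complex networks - Clustering coefficient -

• Tendency to cluster

• Node i
  - k_i links to k_i nodes (neighbours)
  - E_i – number of links between neighbours

$C_i = \frac{2 E_i}{k_i (k_i - 1)}$

• If node i and its neighbours form a complete subgraph, C_i = 1

[Figure: node i with four neighbours, two links between neighbours]

$C_i = \frac{2 \cdot 2}{4 \cdot 3} = \frac{1}{3}$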
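A minimal sketch of the local clustering coefficient C_i = 2E_i / (k_i(k_i - 1)) follows. It assumes an undirected graph without self-loops and returns 0 for nodes with fewer than two neighbours, which is a convention chosen here rather than something the slides specify. The example in `main` is a hypothetical node 0 with four neighbours and two links among them.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class Clustering {
    /** Builds undirected adjacency sets from an edge list. */
    static List<Set<Integer>> undirected(int n, int[][] edges) {
        List<Set<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) adj.add(new HashSet<>());
        for (int[] e : edges) {
            adj.get(e[0]).add(e[1]);
            adj.get(e[1]).add(e[0]);
        }
        return adj;
    }

    /** C_i = 2 * E_i / (k_i * (k_i - 1)); 0 when k_i < 2 (assumed convention). */
    static double local(int i, List<Set<Integer>> adj) {
        List<Integer> neighbours = new ArrayList<>(adj.get(i));
        int k = neighbours.size();
        if (k < 2) return 0.0;
        int linksBetweenNeighbours = 0;
        for (int a = 0; a < k; a++) {
            for (int b = a + 1; b < k; b++) {
                if (adj.get(neighbours.get(a)).contains(neighbours.get(b))) {
                    linksBetweenNeighbours++;
                }
            }
        }
        return 2.0 * linksBetweenNeighbours / (k * (k - 1));
    }

    /** Network clustering coefficient as the mean of C_i over all nodes. */
    static double average(List<Set<Integer>> adj) {
        double sum = 0.0;
        for (int i = 0; i < adj.size(); i++) sum += local(i, adj);
        return sum / adj.size();
    }

    public static void main(String[] args) {
        int[][] edges = {{0, 1}, {0, 2}, {0, 3}, {0, 4}, {1, 2}, {3, 4}};
        List<Set<Integer>> adj = undirected(5, edges);
        System.out.println("C_0 = " + local(0, adj));
        System.out.println("C   = " + average(adj));
    }
}
```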

Page 11: Characteristics of Class Collaboration Networks in Large Java Software Projects

Mathematical models of complex networks

• Erdős-Rényi /ER/ model – random networks

• Barabási-Albert /BA/ model – scale-free networks

Page 12: Characteristics of Class Collaboration Networks in Large Java Software Projects

Mathematical models of complex networks - ER model -

Alg: Generate ER network

Input: p – connection probability [0..1]
       n – number of nodes

Output: ER network

for (i = 1; i < n; i++)
    for (j = 0; j < i; j++)
        if (rand(0, 1) < p)
            Connect(i, j);
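The algorithm above can be sketched in Java as follows. Each of the n(n-1)/2 node pairs is connected independently with probability p, which means the random draw must fall below p for a link to be created. The class name, the edge-list return type, and the use of `java.util.Random` are choices made for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

final class ErdosRenyi {
    /** Returns the undirected edges {i, j} of an ER network on nodes 0..n-1. */
    static List<int[]> generate(int n, double p, Random rng) {
        List<int[]> edges = new ArrayList<>();
        for (int i = 1; i < n; i++) {
            for (int j = 0; j < i; j++) {
                // nextDouble() is uniform on [0, 1), so this holds with probability p.
                if (rng.nextDouble() < p) {
                    edges.add(new int[] {i, j});
                }
            }
        }
        return edges;
    }

    public static void main(String[] args) {
        List<int[]> edges = generate(100, 0.05, new Random(42));
        System.out.println("links: " + edges.size() + " of " + (100 * 99 / 2) + " possible");
    }
}
```

The expected number of links is p·n(n-1)/2, so p = 0 yields an empty graph and p = 1 a complete one.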

Page 13: Characteristics of Class Collaboration Networks in Large Java Software Projects

Mathematical models of complex networks - BA model -

• Start with small random graph

• Growth
  * in each iteration add new node with m links

• Preferential attachment
  * new node prefers to link to highly connected nodes
  * $\Pi(k_i) = \frac{k_i}{\sum_j k_j}$: the probability that the new node connects to a node with k links is proportional to k
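A minimal sketch of growth with preferential attachment is given below. Two simplifications are assumptions of this sketch, not of the slides: the seed is a complete graph on m + 1 nodes rather than a small random graph, and each new node links to m distinct existing nodes. Degree-proportional selection is done by drawing uniformly from a list in which every node appears once per link it has.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

final class BarabasiAlbert {
    /** Returns the edges {newNode, target} of a BA-style network on nodes 0..n-1. */
    static List<int[]> generate(int n, int m, Random rng) {
        if (m < 1 || n <= m) throw new IllegalArgumentException("need 1 <= m < n");
        List<int[]> edges = new ArrayList<>();
        List<Integer> endpoints = new ArrayList<>(); // node i appears k_i times

        // Seed (assumed): complete graph on nodes 0..m.
        for (int i = 0; i <= m; i++) {
            for (int j = 0; j < i; j++) addEdge(i, j, edges, endpoints);
        }
        // Growth: each new node brings m links.
        for (int v = m + 1; v < n; v++) {
            Set<Integer> targets = new HashSet<>();
            while (targets.size() < m) {
                // Uniform draw from endpoints picks node i with probability k_i / sum_j k_j.
                targets.add(endpoints.get(rng.nextInt(endpoints.size())));
            }
            for (int t : targets) addEdge(v, t, edges, endpoints);
        }
        return edges;
    }

    private static void addEdge(int a, int b, List<int[]> edges, List<Integer> endpoints) {
        edges.add(new int[] {a, b});
        endpoints.add(a);
        endpoints.add(b);
    }

    public static void main(String[] args) {
        List<int[]> edges = generate(1000, 2, new Random(42));
        System.out.println("links: " + edges.size());
    }
}
```

Targets are collected before any of the new node's links are added, so the new node cannot select itself.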

Page 14: Characteristics of Class Collaboration Networks in Large Java Software Projects

Mathematical models of complex networks - BA model -

• Degree distribution follows a power law: $P(k) \sim k^{-\gamma}$

1. Most real and engineered networks are scale-free and can be modeled by the BA model and its modifications

2. Both models (ER and BA) can produce the small-world property

3. The clustering coefficient of a scale-free network is much larger than that of a comparable random network

Page 16: Characteristics of Class Collaboration Networks in Large Java Software Projects

Network Extraction

• Class diagrams/JavaDoc/Source code

• YACCNE
  * Jung, JavaCC

• Node connecting rules

1. Class A gives an incoming link to class B if A imports B
2. Class A gives an incoming link to class B if B is in the same package as A, and A references B
3. Class A gives an incoming link to class B if A references B through its full package path
4. References to classes outside the software system are excluded
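The four connecting rules can be illustrated with a small resolver. This is not YACCNE's implementation, which the slides do not show; it is a simplified sketch that assumes the parser has already produced, for one class, its package, its single-type imports, and the type names it references. Wildcard imports and nested classes are ignored, and every name below is hypothetical.

```java
import java.util.Set;
import java.util.TreeSet;

final class LinkRules {
    /**
     * Returns the fully qualified names of the classes that receive a link from class A.
     *
     * @param pkg             package of class A
     * @param imports         fully qualified single-type imports of class A
     * @param referencedNames type names used in A's code, simple or fully qualified
     * @param systemClasses   fully qualified names of all classes in the analysed system
     */
    static Set<String> outgoingLinks(String pkg, Set<String> imports,
                                     Set<String> referencedNames, Set<String> systemClasses) {
        Set<String> targets = new TreeSet<>();
        // Rule 1: A imports B. Rule 4: keep B only if it belongs to the system.
        for (String imported : imports) {
            if (systemClasses.contains(imported)) targets.add(imported);
        }
        for (String name : referencedNames) {
            if (name.contains(".")) {
                // Rule 3: B referenced through its full package path.
                if (systemClasses.contains(name)) targets.add(name);
            } else {
                // Rule 2: B referenced by simple name and located in A's package.
                String samePackage = pkg + "." + name;
                if (systemClasses.contains(samePackage)) targets.add(samePackage);
            }
        }
        return targets;
    }

    public static void main(String[] args) {
        Set<String> system = Set.of("app.core.Engine", "app.core.Main",
                                    "app.util.Helper", "app.io.Reader");
        Set<String> links = outgoingLinks(
            "app.core",
            Set.of("app.util.Helper", "java.util.List"),
            Set.of("Engine", "List", "app.io.Reader"),
            system);
        System.out.println(links);
    }
}
```

In the usage example, `java.util.List` is dropped by rule 4 because it lies outside the hypothetical system.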

Page 18: Characteristics of Class Collaboration Networks in Large Java Software Projects

Experiments and results - Experiments -

• JDK, Tomcat, Ant, Lucene, JavaCC
  - cumulative incoming/outgoing link degree distributions
  - small-world coefficient
  - clustering coefficient

• Ten successive versions of Ant (from 1.5.2 to 1.7.0)
  - compared
  - can the preferential attachment rule model Ant's evolution?

Page 19: Characteristics of Class Collaboration Networks in Large Java Software Projects

Experiments and results - JDK -

                          Our work      Valverde and Solé (2003)
γ[in]                     2.17493       2.18
γ[out]                    3.63214       3.39
Small-world coefficient   4.391         5.40
Clustering coefficient    0.453         0.225
Extraction method         Source code   Class diagrams

Page 20: Characteristics of Class Collaboration Networks in Large Java Software Projects

Experiments and results - In/Out Degree distributions -

Class collaboration network   γ[in]     R²       γ[out]    R²
JDK                           2.17493   0.9541   3.63214   0.9667
Ant                           2.05001   0.9927   3.93654   0.9281
Tomcat                        2.35234   0.9294   3.5026    0.9499
Lucene                        1.98075   0.9050   4.29761   0.9028
JavaCC                        2.26362   0.8946   2.20816   0.9656

• γ[in] < γ[out] (except JavaCC)

• Same result for various CCNs: Myers (2003), Valverde and Solé (2003)
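The slides report an exponent and an R² value per distribution but do not state the fitting procedure. One common approach, assumed here, is an ordinary least-squares line through the points (ln k, ln P). For a pure power law P(k) ~ k^(-γ), the cumulative distribution decays as k^(-(γ-1)), so a slope s fitted on cumulative data would correspond to γ = 1 - s; whether the reported values were obtained this way is not known from the slides. Class and method names are hypothetical.

```java
final class PowerLawFit {
    /** Least-squares line through (ln x, ln y); returns {slope, intercept, R^2}. */
    static double[] fitLogLog(double[] x, double[] y) {
        int n = x.length;
        double sx = 0, sy = 0, sxx = 0, sxy = 0, syy = 0;
        for (int i = 0; i < n; i++) {
            double lx = Math.log(x[i]);
            double ly = Math.log(y[i]);
            sx += lx;
            sy += ly;
            sxx += lx * lx;
            sxy += lx * ly;
            syy += ly * ly;
        }
        double covXY = n * sxy - sx * sy;
        double varX = n * sxx - sx * sx;
        double varY = n * syy - sy * sy;
        double slope = covXY / varX;
        double intercept = (sy - slope * sx) / n;
        double r = covXY / Math.sqrt(varX * varY);
        return new double[] {slope, intercept, r * r};
    }

    public static void main(String[] args) {
        // Synthetic data following y = x^-2, not data from the study.
        double[] k = {1, 2, 3, 4, 5};
        double[] p = new double[k.length];
        for (int i = 0; i < k.length; i++) p[i] = Math.pow(k[i], -2.0);
        double[] fit = fitLogLog(k, p);
        System.out.println("slope = " + fit[0] + ", R^2 = " + fit[2]);
    }
}
```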

Page 21: Characteristics of Class Collaboration Networks in Large Java Software Projects

Experiments and results - Small world and clustering coefficient -

          #nodes   #links   l        c       c[rand]
JDK       1878     12806    4.391    0.453   0.0036
Ant       778      3634     4.131    0.505   0.006
Tomcat    1046     4646     1.909    0.464   0.0042
Lucene    354      2221     2.2778   0.386   0.0177
JavaCC    79       274      1.22     0.437   0.0439

• l[Tomcat] ~ lnln(N[Tomcat])

• l[JavaCC] ~ lnln(N[JavaCC])

• c >> c[rand]
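The slides do not say how c[rand] was obtained. The tabulated values are numerically consistent with dividing the mean number of links per node by the number of nodes, that is #links / #nodes², which is the clustering coefficient expected of a random network with the same size and density. The sketch below recomputes that quantity and ln ln N from the table's node and link counts; treat the formula as an inference from the numbers, and the class and method names as hypothetical.

```java
final class RandomBaseline {
    /** Assumed baseline: (links / nodes) / nodes. */
    static double cRand(int nodes, int links) {
        return (double) links / ((double) nodes * nodes);
    }

    /** ln ln N, the ultra-small-world reference value for path length. */
    static double lnLn(int nodes) {
        return Math.log(Math.log(nodes));
    }

    public static void main(String[] args) {
        String[] names = {"JDK", "Ant", "Tomcat", "Lucene", "JavaCC"};
        int[] nodes = {1878, 778, 1046, 354, 79};
        int[] links = {12806, 3634, 4646, 2221, 274};
        for (int i = 0; i < names.length; i++) {
            System.out.printf("%-7s c_rand = %.4f  ln ln N = %.3f%n",
                names[i], cRand(nodes[i], links[i]), lnLn(nodes[i]));
        }
    }
}
```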

Page 22: Characteristics of Class Collaboration Networks in Large Java Software Projects

Experiments and results - Ant CCN Evolution -

1.5.4: 536 nodes, 2241 links

1.6.0: 114 new nodes, 525 new links

[Figure: Ant class collaboration network; labelled classes: org.apache.tools.ant.Project, org.apache.tools.ant.BuildException, org.apache.tools.ant.Task; annotations: (336, 63), (220, 43), (124, 22)]

Page 23: Characteristics of Class Collaboration Networks in Large Java Software Projects

Experiments and results - Ant CCN Evolution -

1.6.5: 690 nodes, 3000 links

1.7.0: 132 new nodes, 44 deleted nodes, 634 new links

[Figure: Ant class collaboration network; labelled classes: org.apache.tools.ant.Project, org.apache.tools.ant.BuildException; annotations: (417, 69), (269, 44)]

Page 25: Characteristics of Class Collaboration Networks in Large Java Software Projects

Conclusion

• The analyzed networks exhibit scale-free (or nearly scale-free) and small-world properties.

• The preferential attachment concept introduced in the BA model can explain the evolution of Ant’s class collaboration network.
