Ben-Gurion University of the Negev
Faculty of Natural Sciences
Department of Computer Science
Metric-Driven Approach to Benchmarking Model
Correctness Algorithms
by Victor Makarenkov
Supervised by: Prof. Mira Balaban
THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS
FOR M.Sc DEGREE
July 2011
Abstract
This thesis presents a metric-based method for automatic benchmark creation. The thesis
provides patterns of model-based metrics, and an implemented method for translating these
patterns into Alloy and automatically creating benchmark models.
This research was motivated by a study of finite satisfiability of class diagrams, which extended
the FiniteSat algorithm to support the qualifier constraint. This extension of the FiniteSat
algorithm was also developed within this thesis. Further, while studying the practical
occurrence and relevance of correctness problems within class diagrams, we encountered the
problem of manually creating class diagrams for experiments.
The manual creation of benchmark models motivated the rest of the thesis. In order to
evaluate algorithms that operate on class diagram correctness problems, a problem sample is
needed. Creating such a sample for a benchmark is not a simple task. Since every
algorithm needs different models for its evaluation, for example to find its strengths and
weaknesses, an automatic way of creating the models is needed. The major contributions of
this thesis are in the following topics:
− Analytical evaluation of model metrics.
− Classification of metrics into patterns.
− Development of language patterns for description of every metric pattern.
− Showing an algorithm for translation of each metric pattern into Alloy.
Acknowledgments
I am deeply grateful to Professor Mira Balaban for guiding me through the research topic.
Professor Balaban taught me a lot about engineering processes in the field of software
engineering, and provided a good deal of motivation and advice on how research in computer
science must be done. Professor Balaban was absolutely patient with me, in the end giving me
an opportunity to learn not only research topics in software engineering, but also subjects
far beyond the strictly scientific.
I would like to thank a very good friend of mine, Azzam Maraee. During both the most
exciting and the most difficult periods of the last years, he was always supportive, giving me
endless professional and personal help. He contributed an enormous amount of ideas and
insights to this thesis.
Lastly, and most importantly, I wish to thank my parents Andrey and Yelena, and my wife
Nataly, for their love, constant support and belief in my success.
Table of Contents
1 Introduction
I Finite Satisfiability of Class Diagrams
2 Background
2.1 Class Diagrams: Syntax and Semantics
2.2 Correctness of Class Diagrams
2.2.1 Inconsistency and lack of finite satisfiability
2.2.2 Detection and Identification of Finite Satisfiability
2.3 FiniteSat Algorithm
3 FiniteSat Algorithm: Extension to Qualifier
3.1 Qualifier Explained
3.2 Discussion on Qualifier Semantics
3.3 FiniteSat Extension for Qualifier Constraint
3.3.1 Correctness and Complexity of FiniteSat
4 Practical Occurrence of Finite Satisfiability in Class Diagrams
4.1 Class Diagram's Metrics
4.2 Experiment 1
4.3 Experiment 2
4.4 Experiment 3
II Metrics
5 Metrics
5.1 Weyuker's Characterization of Metrics Properties
5.2 Object Oriented Metrics
5.2.1 Bunge's definition of object complexity
5.2.2 Chidamber and Kemerer Metrics and Evaluation
5.3 Metrics for Models
5.3.1 Background
5.3.2 Weyuker's Properties Adaptation for Models Metrics
5.3.3 Metrics Evaluation
5.3.4 Related Work on Model Metrics
6 Benchmarking
6.1 Metric-Driven Benchmark Creation
6.1.1 Metrics as means for algorithm evaluation
6.1.2 Brute-Force Benchmark Creation Without Abstraction
6.2 Benchmark Creation via Model Checking
6.2.1 Introduction to Alloy
6.2.2 Generating Models from Meta-Model Metrics with Alloy
6.3 Automation: A Language For Metrics Values Definition
6.3.1 Metrics classification
7 Conclusions and Future Work
A Reasoning Infrastructure Implementation
A.1 Implementation of FiniteSat Algorithm
A.2 Implementation in Detail
A.3 Structural Architecture and Conclusions
List of Figures
1.1 A Class Diagram with a Finite Satisfiability Problem
2.1 UML Class Diagram
2.2 Binary multiplicity constraint
2.3 Binary Association
2.4 Unsatisfiability due to multiplicity constraint conflict
2.5 The digraph representation of a binary association
2.6 Unconstrained Hierarchy Structures
3.1 Visual Qualifier Notation
3.2 A binary association example.
3.3 Modeling a Unix file system.
3.4 Modeling an array with qualifier.
3.5 A general qualifier constraint
3.6 A general qualifier constraint
3.7 Example of multiplicities bounds under different semantic interpretations. (a) Shows unsuitable situation for universal interpretation. (b) Shows the unconstrained model with zero lower bound.
3.8 A TV-Network with broadcast schedule example
3.9 A reduced qualifier constraint
3.10 A class diagram reduction
3.11 The Γ′ mapping of a CD′ instance to a CD instance
3.12 Γ′ Mapping of instances
3.13 The Γ Mapping
3.14 Γ′ Mapping on different A objects.
3.15 Γ′ Mapping on the same A object.
3.16 Γ′ Mapping on the same A object and same B object.
4.1 Appearing FiniteSat on 50 Classes Diagram
4.2 Appearing FiniteSat on 100 Classes Diagram
5.1 A class diagram
5.2 Class Diagram: CD1
5.3 Class Diagram: CD2
5.4 Class Diagram: CD3
5.5 CD1
5.6 CD2
5.7 Class Diagram: CD3
5.8 Class Diagram: CD1
5.9 Class Diagram: CD2
5.10 Class Diagram: CD3
5.11 Merging of Class Diagram CD1 and CD3
5.12 An object oriented metamodel
5.13 Extension to the UML 2.0 metamodel. This UML package diagram shows the definition of the CK metrics as a separate package, with a dependency on classes from the UML metamodel.
5.14 NOC Metric Definition. This OCL code defines the NOC metrics from the CK metrics suite, and is part of a larger definition of the whole CK metric suite which we have implemented using dMML.
6.1 Class Diagram Meta-Model
6.2 A partial meta-model of UML in Alloy Analyzer.
6.3 Instance finding.
6.4 Instance of the specified metamodel
6.5 Tree view of Instance of the specified metamodel
6.6 Example of UML meta-model with two elements: X and Y
6.7 Example of Alloy-written meta-model with two elements: X and Y
6.8 A Generated instance of the meta-model
6.9 Example of UML meta-model with two elements: X and Y
6.10 Example of meta-model with two elements: X, Y with 1:2 ratio between them
6.11 Example of partial UML meta-model in Alloy
6.12 A Generated model where each class has exactly one sub class.
6.13 Example of meta-model with three elements: X, Y and Z
6.14 Example of meta-model with three elements: X, Y and Z
6.15 A Generated instance of the meta-model
6.16 A Generated instance of the meta-model
A.1 Reasoning Tool Internal Structure
A.2 The structural architecture of reasoning tool
A.3 Class diagram of our tool's static structure
List of Tables
2.1 UML Diagrams
2.2 The Scope of The FiniteSat Algorithm
4.1 Existence Experiment Results
4.2 Scalability Experiment Results
Chapter 1
Introduction
The Unified Modeling Language (UML) is nowadays the industry-standard modeling framework,
comprising multiple visual modeling languages, referred to as UML models. Traditionally,
UML models are used for the analysis and design of complex systems, and they are now
being integrated into most major Integrated Development Environments (IDEs), such as the
open-source Eclipse IDE [34] and the most common IDEs from proprietary industry
vendors. Their relevance has increased with the advent of the Model-Driven Development
(MDD) approach, in which analysis and design models play an essential role in the process
of software development. Recently, with the emergence of web-enabled agent technology,
UML models are also used for ontology representation, and for the construction and
extraction of ontologies [40, 8, 28].
In view of their wide popularity, it is highly important that UML models provide reliable
support for the designed systems, and be subject to stringent quality assurance and quality
control criteria [88]. Indeed, an extensive research effort is devoted to the formalization
of UML models, the specification of their semantics, and the development of reasoning and
correctness checking methods [15, 74]. Moreover, with the prevalence of the Model Driven
Engineering approach, it is expected that all information in a design model will be effective
in its successive models.
Modeling problems usually arise when models are scaled up to describe large, distributed
applications. A model may originate from different sources, and a large number of designers
can be involved in the modeling process. Designers are highly prone to making mistakes,
and combining information from different sources gives rise to potential conflicts [17, 44, 23].
Lange et al. [53] show that defects often remain undetected, even if the model is read
attentively by practitioners.
It is highly important that models are tested for correctness, and that problems are
detected as early as possible in the software design process. Nevertheless, current CASE tools
do not support reasoning about UML models, and allow the construction of erroneous ones.
Furthermore, implementation languages still do not enforce design-level constraints. Hence,
there is an urgent need for reasoning methods for detecting analysis and design problems.
The number of algorithms dealing with models is constantly growing:
− General errors recognition.
− Detection of the reason for an error.
− Transformation and improvements of models.
− Classifying models into patterns, see [12].
− Developing special data structures for models and support of model querying.
Due to the increasing use of models and the importance ascribed to them, it is crucial to
develop means for examining model complexity from different points of view, and for comparing
models. As with customary methods for evaluating and comparing software, which measure
complexity, implementation size and probability of bugs, metrics are being developed along
with techniques for evaluating their suitability.
Unfortunately, there is no mechanism for evaluating a metrics suite for models. This
thesis contributes to the development of methods for evaluating metrics for models, similar
to what Weyuker [89] proposed for sequential programs.
In order to examine and compare algorithms, their implementations, their scalability, and the
problems that the algorithms are meant to solve, benchmarks are needed. For databases and
software there are real benchmarks: large software systems or synthetic creations that serve
as agreed benchmarking problems. However, for large models in general, and especially for
those that make extensive use of modeling constraints, there are no large applications.
The main question addressed in this work is how to create benchmarks properly. The
approach presented uses metrics for examining model complexity, and algorithms for
benchmark creation. For example:
− Creating class diagrams with complex class hierarchy structures, such as deep
inheritance trees.
− Creating class diagrams with different ratios between the numbers of constraints of
different kinds imposed on the class diagram.
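The two example metrics above can be made concrete with a small sketch. It is only an illustration under stated assumptions: the dictionary-based representation, the function names and the sample class names are hypothetical and are not the metric language developed later in the thesis. The sketch also assumes single inheritance without cycles.

```python
def inheritance_depth(cls, parent_of):
    """Number of generalization edges from cls up to the root of its tree."""
    depth = 0
    while cls in parent_of:          # assumes single, acyclic inheritance
        cls = parent_of[cls]
        depth += 1
    return depth


def max_inheritance_depth(classes, parent_of):
    """Depth of the deepest inheritance tree in the diagram (0 if no classes)."""
    return max((inheritance_depth(c, parent_of) for c in classes), default=0)


def constraint_ratio(counts, kind_a, kind_b):
    """Ratio between the numbers of two kinds of constraints.

    Returns None when there is no constraint of kind_b, so the ratio is undefined.
    """
    if counts.get(kind_b, 0) == 0:
        return None
    return counts.get(kind_a, 0) / counts[kind_b]


# Hypothetical diagram: class names and constraint counts are made up.
classes = ["Person", "Academic", "Graduate", "Course"]
parent_of = {"Academic": "Person", "Graduate": "Academic"}
counts = {"multiplicity": 6, "hierarchy": 2}

print(max_inheritance_depth(classes, parent_of))              # 2
print(constraint_ratio(counts, "multiplicity", "hierarchy"))  # 3.0
```

A benchmark generator driven by such metrics would work in the opposite direction: given target values (for example, depth 10 and ratio 3.0), it would have to produce diagrams that attain them.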
Using metrics for creating benchmarks is not straightforward. The key to independent
and fast creation, without using any special application, is abstraction. A meta-model is
needed for defining the metrics. A model checker is then used for creating instances of
the meta-model:
− Input: a specification of correctness conditions and the model.
− If the conditions do not hold, a counterexample is generated.
This method is used in this thesis for benchmark generation, specifically with the Alloy model
checker.
The problem of finite satisfiability has been addressed and studied in the context of various
kinds of conceptual schemata [24, 42, 45, 56, 86] and in the context of description logics [22].
The problem was recently studied in the context of class diagrams [61, 9, 64], showing an
application of finite satisfiability recognition and detection when a UML class diagram involves
a combination of cardinality constraints, class hierarchy constraints, and generalization set
constraints. A class diagram is finitely satisfiable¹ if it has a finite and non-empty instance.
The example below shows a finite satisfiability problem.
Figure 1.1: A Class Diagram with a Finite Satisfiability Problem
Figure 1.1 presents a multiplicity constraint cycle that involves a compound class, Graduate,
whose instances must be related to Academic instances. The number of
student-advisor links in every diagram instance must be both $|G| \cdot 1$ and $|A| \cdot 2$, where
$|G|$ and $|A|$ are the numbers of graduates and academics, respectively. Therefore, the
extensions of Graduate and Academic must satisfy $|G| = |A| \cdot 2$, while the Graduate
extension is a subset of the Academic extension, and therefore $|G| \le |A|$. Together these give
$2 \cdot |A| \le |A|$, which can be satisfied only by empty or infinite extensions. Such problems are
termed finite satisfiability problems.
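The arithmetic behind this example can be checked with a short sketch. It is a bounded search over candidate sizes, not the FiniteSat algorithm; the function name and the search bound are assumptions made for illustration. The search confirms, for sizes up to the bound, what the argument above shows in general: the only finite solution is the empty one.

```python
def finite_solutions(max_size=50):
    """Enumerate (|G|, |A|) pairs up to max_size that satisfy both constraints."""
    solutions = []
    for a in range(max_size + 1):        # candidate number of academics
        for g in range(max_size + 1):    # candidate number of graduates
            links_match = (g * 1 == a * 2)   # both counts of student-advisor links agree
            subset_ok = (g <= a)             # Graduate extension is a subset of Academic
            if links_match and subset_ok:
                solutions.append((g, a))
    return solutions


print(finite_solutions())  # [(0, 0)]
```

Since a finitely satisfiable diagram needs a non-empty finite instance, the single solution (0, 0) means that the diagram of Figure 1.1 is not finitely satisfiable.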
In order to check the relevance of the problem, a need for problem benchmarks arose.
To check this method, we start by applying metric-driven benchmark creation
to the problem of finite satisfiability of class diagrams: a metric suite is proposed, and its
correspondence to problem evaluation is examined, namely the relevance of the finite
satisfiability problem and the scalability of the FiniteSat algorithm.
¹ and thus correct
This thesis started with an attempt to extend the scope of the FiniteSat algorithm, introduced
by Balaban and Maraee [11], to handle the Qualifier constraint [65], and to implement the
existing algorithm and examine it for scalability, and the problem for relevance [59]. During
the implementation it became apparent that the direct approach to the problem, without any
abstraction, forces a change of the implementation after any change in the metrics suite. The
search for a useful abstraction of a metric suite, one that can express the metrics of
a model, led to the use of a model checker. The major contributions of this thesis are
in the following topics:
− Analytical evaluation of model metrics.
− Classification of metrics into patterns.
− Development of language patterns for description of every metric pattern.
− Showing an algorithm for translation of each metric pattern into Alloy.
The thesis is organized as follows. In the first part, Chapter 2 presents the notion of finite
satisfiability and summarizes relevant methods for the detection and identification of finite
satisfiability problems in class diagrams. The chapter then presents the FiniteSat
algorithm, introduced by Balaban and Maraee [11], which plays a central role in this work. In
Chapter 3 we present a polynomial-time algorithm that extends the FiniteSat algorithm to
handle the qualifier constraint. Chapter 4 presents an initial exploration of the practical
occurrence of finite satisfiability problems within class diagrams. In the second part, Chapter 5
presents the relevant background on model metrics, while Chapter 6 deals with the definition of
a metric language that can also be used for automatic metric-driven model specification and
for the generation of benchmark problem sets. We demonstrate this automatic generation with an
Alloy [46] implementation. Finally, Chapter 7 concludes this work and outlines directions for
future research. See Appendix A for details of the implementation of the algorithms described
in this work, in particular the implementation used for the experiments reported in Chapter 4.
Part I
Finite Satisfiability of Class Diagrams
Chapter 2
Background
The Unified Modeling Language (UML) is now the standard graphical modeling language,
developed and adopted by the Object Management Group for specifying, visualizing,
constructing, and documenting the artifacts of software systems, as well as for business modeling
and other non-software systems [72]. UML simplifies the complex process of software design
by raising the level of abstraction throughout the analysis and design process. The relevance
of UML models has increased with the advent of the Model Driven Development (MDD) approach,
in which analysis and design models play an essential role in the process of software
development. Recently, with the emergence of web-enabled agent technology, UML models are
also used for ontology representation, construction and extraction.
A central assumption underlying the development of UML was the idea that it is
not possible to describe a complex system with a single model only. A "rich" description of a
system must include a number of highly detailed models. UML consists of twelve diagrams,
referred to as UML models. Table 2.1 summarizes the UML diagrams and the modeling
view of software solutions that each of them represents (extracted from [88]).
UML Diagram            Represents
Use case               functionality from the user's viewpoint
Activity               the flow within a use case or the system
Class                  classes, entities, business domain, database
Interaction overview   interactions at a general high level
Communication          interactions between objects
Object                 objects and their links
State machine          the run-time life cycle of an object
Composite structure    component or object behavior at run-time
Component              executables, linkable libraries, etc.
Deployment             hardware nodes and processors
Package                subsystems, organizational units
Timing                 time concept during object interactions
Table 2.1: UML Diagrams
2.1 Class Diagrams: Syntax and Semantics
Among the twelve visual UML models, class diagrams are probably the most important and
best understood. UML class diagrams are used to specify, visualize,
and document the static view of a system. They also serve as a basis for generating
implementation artifacts such as code skeletons and database schemata, as a means for knowledge
representation such as specifying ontologies, and for defining meta-models of other
programming, modeling, and specification languages. The class diagram model originates in the
conceptual models of the 1980s, such as Entity-Relationship (ER) diagrams [25], their Enhanced
versions (EER), Object-Role Modeling (ORM) diagrams [41], and Frames structured
modeling in artificial intelligence [70]. The UML class diagram model includes elements from all
these models.
A class diagram is a structural abstraction of a real-world phenomenon. The model
consists of basic elements, descriptors and constraints. The basic elements are classes and
associations, the descriptors are class and association attributes, and the constraints are
restrictions imposed on these elements. The constraints are (1) multiplicity constraints on
associations (also termed cardinality constraints), with or without qualifiers; (2) association
class constraints; (3) class and association hierarchy constraints; (4) generalization set
constraints; (5) association constraints; (6) aggregation constraints; (7) multiplicity constraints
on attributes. The syntax and informal semantics are described in Rumbaugh et al. [77] and
in OMG-UML [72]. As opposed to computer programs, the class diagrams of a system are
partial. That is, the absence of an attribute from a class in a class diagram does not
mean that the attribute does not exist in that class.
Figure 2.1 is an example of a class diagram, which partially specifies a university system.
It captures the hierarchy of people within the university and their relationships to the
university courses. Classes are represented by rectangles; associations are represented by lines
between the rectangles; qualifiers are represented by small rectangles attached to association
ends; an n-ary association, that is, an association among three or more classes, is shown
as a large diamond with a line from the diamond to each participant class; multiplicity
constraints are marked on the ends of association lines; association classes are marked by a
dashed line connecting a class rectangle with an association line; class hierarchy constraints
are marked by empty arrowheads; association hierarchies are marked by a dashed arrow
labeled "subset" between association lines. An aggregation is a special form of binary
association. It is represented by a hollow diamond adornment on the end of the association line
at which it connects to the aggregate class. If the aggregation is a composition, the
diamond is filled.
The standard set-theoretic semantics of class diagrams associates a class diagram with
class diagram instances, in which classes have extensions that are sets of objects that share
structure and operations, and associations have extensions that are relationships among
their end class extensions. We denote class diagrams as CD, class symbols as C, association
symbols as R, role symbols as rn and instance symbols as I. The extension in I of a symbol T of
CD is denoted $T^I$. Henceforth, we shorten expressions like "instance of an extension of C"
to "instance of C" and "instance of an extension of R" to "instance of R". For example, in
Figure 2.1, the Academic class represents a set of academic people in a university, and the
binary association between FacultyMember and Course denotes a set of pairs of FacultyMembers
and Courses in which the FacultyMember plays the role of a teacher. The ternary association
between FacultyMember, Course and Student denotes a set of 3-tuples of values, one from each of
the respective classes.
Figure 2.1: UML Class Diagram
Constraints are used to restrict the otherwise unrestricted extensions of the class
diagram elements. Constraints provide an essential means of knowledge engineering, since they
extend the expressiveness of diagrams. That is:
− Class and association constraints restrict the set and relationship extensions of
classes and associations, respectively.
− Attribute constraints restrict attribute values in terms of types and multiplicity.
A legal instance of a class diagram is an instance that satisfies all constraints.
The semantics of class diagram constraints:
1. Binary cardinality constraints on binary associations: A binary cardinality
constraint (also termed "multiplicity constraint") is symbolically denoted:
$$R(rn_1 : C_1[\mathit{min}_1, \mathit{max}_1],\; rn_2 : C_2[\mathit{min}_2, \mathit{max}_2]) \tag{2.1}$$
The multiplicity constraint $[\mathit{min}_1, \mathit{max}_1]$ that is visually written on the $rn_1$ end of the
association line is actually a participation constraint on instances of $C_2$. It states that
an instance of $C_2$ can be related via R to n instances of $C_1$, where n lies in the interval
$[\mathit{min}_1, \mathit{max}_1]$. For example, according to Figure 2.1, an Academic must advise at least
two Graduates (as indicated by the 2..* multiplicity constraint). Formally: in every
instance I, for every $e_1 \in C_1^I$, $\mathit{min}_2 \le |\{e_2 \mid (e_1, e_2) \in R^I\}| \le \mathit{max}_2$.
Figure 2.2: Binary multiplicity constraint
2. N-ary association multiplicity constraints: Multiplicity constraints on an n-ary
association R between the classes $C_1, \ldots, C_n$ with the roles $rn_1, \ldots, rn_n$, respectively,
are symbolically denoted by the following relationship construct:
$$R(rn_1 : C_1[m_1, n_1], \ldots, rn_n : C_n[m_n, n_n]) \tag{2.2}$$
A multiplicity constraint $[m_i, n_i]$ on an end (role) of an n-ary association represents the
possible number of values (objects) of $C_i$ when the values at the other n-1 ends are fixed.
Consider the ternary association S.F.C in Figure 2.1: a Student will not take the same
Course from more than one FacultyMember, but a Student may take more than one
Course from a single FacultyMember, and a FacultyMember may teach more than one
Course. The cardinality constraint defined in a binary association is clear and applies
to all class instances.
3. Qualifier attribute constraints: These are optional for the ends (roles) of a binary
association. A qualifier constraint distinguishes the set of objects at the far end of the
association based on the qualifier value, and is symbolically denoted:
$$R(rn_1 : SourceClass\{(q_1, T_1) \ldots (q_n, T_n)\}[\mathit{min}_1, \mathit{max}_1],\; rn_2 : TargetClass[\mathit{min}_2, \mathit{max}_2])$$
A qualifier is used within a qualified association to relate a qualified object to a target
object using a qualifier value that is taken from the qualifier domain. The multiplicity on
the target side restricts the number of target objects that can be related to a qualified
object. In Figure 2.1, the binary association between FacultyMember and Course is
qualified by the qualifier Semester, whose domain is the Semesters enumeration. This
says that a qualified FacultyMember (a FacultyMember-Semester pair) can teach at
most one Course (target class) in the specified Semester (qualifier). In Chapter 3 we
discuss the two possible formal interpretations of the qualifier constraint, and the one
that seems to be the preferred UML interpretation (UML semantics is only verbally
specified).
4. Association classes restrict their objects to be uniquely identified by pairs of the
connected association. In Figure 1, every Enrollment object is identified by a unique
course-student pair in 1:1 correspondence (no two enrollments are identified by the
same pair).
5. Aggregation constraint: reflects a whole-part relationship between a class (the
assembly) and its parts (the component classes). For example, the class University in
Figure 2 is the assembly, and the classes MathFaculty and CompFaculty are the
component classes. The aggregation relationship is transitive and asymmetric across
all aggregation links. The asymmetry of aggregation requires that a part
of an assembly cannot aggregate one of its aggregators (the aggregation relation is
acyclic). Composition is a restricted form of aggregation that describes physical
containment and various notions of ownership.
6. Association constraints: It is also possible to define explicit constraints between
associations:
− An {xor} constraint is imposed on two or more associations that have a common
end class (base class). An instance of the base class may participate in at most
one association in the constraint. A multiplicity constraint on an xor-ed association
a applies only if the base class participates in a [10].
− Association hierarchy constraints mean inclusion of the association classes.
Whether they also mean inclusion of the associations of the association classes is
not clarified in the UML2 specification [77]. In any case, the multiplicity
constraints on the associations are inherited.
7. Class hierarchy constraints: specify subset relations between classes. In Figure 2.1,
in every instance the extensions of the FacultyMember and Graduate classes are subsets
of the Academic extension.
8. Generalization set constraints: Class hierarchy constraints can be grouped into
a Generalization Set (GS for short), as shown in Figure 2.1. For example,
GraduateCourse, UnderGraduateCourse and Course form a Generalization Set. In that case,
more constraints can be defined on the group. There are two orthogonal dimensions for
defining such constraints: (1) disjointness and (2) completeness. Below are the four
constraints that can label a generalization set:
(a) complete - An instance of the superclass is an instance of at least one subclass.
(b) incomplete - There might be instances of the superclass that are not instances of
any subclass.
(c) disjoint - Subclass extensions are mutually exclusive.
(d) overlapping - Subclass extensions may overlap.
The GS constraints can be combined to form one of the following valid combinations:
{complete, disjoint}, {incomplete, disjoint}, {complete, overlapping},
{incomplete, overlapping}. For example, the constraint {overlapping, complete} on the
generalization set Course indicates that a course may be both a graduate and an
undergraduate course (overlapping), and every course is either graduate or
undergraduate (complete).
Def 2.1. An instance I of a class diagram CD consists of a domain $D$ and an extension
function $\cdot^I$ that assigns extensions to symbols. For a class symbol $C$, $C^I$ (a shorthand for
$I(C)$) is a subset of $D$, and for an association symbol $a$, $a^I$ is a subset of $D \times D$.
Def 2.2. A legal instance of a class diagram is a finite instance where the class and
association extensions satisfy all constraints in the diagram. Correctness of a class diagram
involves consistency and satisfiability notions, which are discussed in [15, 24, 56, 86]. We
further elaborate this terminology, and suggest additional notions, in order to facilitate a
more accurate definition of correctness.
2.2 Correctness of Class Diagrams
Class diagrams are models written by people, and therefore usually suffer from modeling
problems like inconsistency, redundancy and abstraction errors. Inexperienced designers
tend to create erroneous models, but even experienced ones cannot anticipate the
implication of a change on an overall model. Indeed, Lange et al. showed in [53] that model
defects often remain undetected, even if experienced practitioners check the model
attentively. These problems are aggravated when a model originates from different sources, as
frequently happens when web services are integrated. Combined sources might overlap, and
the integration might yield redundant or inconsistent models [17, 43, 23]. It is clear that such
problems are best solved at the level of models rather than during implementation.
Thus, the need to provide coherent models is appealing. In particular, it is essential to
have tools that can validate the quality and correctness of models. Furthermore, models can be
improved, based on given design criteria. The same holds for meta models, as they underlie
the modeling of concrete systems. In order to achieve the goal of improving model quality,
a diversity of reasoning capabilities is required.
2.2.1 Inconsistency and lack of finite satisfiability
Design quality refers to (1) erroneous models that cannot be populated in an acceptable
way, and (2) low quality models that can be improved according to some design criteria. Reasoning
helps in detecting erroneous models, finding the source of errors and possibly suggesting
repairs. It is also used for revealing redundant situations, and for testing whether design criteria
are met.
Correctness of class diagrams involves two problems: inconsistency and lack of finite
satisfiability. Quality involves redundancy, design improvement and possibly other problems.
Inconsistency arises when the constraints imposed on a class diagram are contradictory,
meaning that there is no legal instance which is not empty. Lack of finite satisfiability is caused by
multiplicity constraints that can be satisfied only by empty or infinite class extensions (i.e.,
instantiations). Redundancy appears when constraints seem to allow values or links that
cannot be realized (are inconsistent). Quality improvement deals with changing the models
following various criteria such as design patterns or reuse enhancements.
2.2.2 Detection and Identification of Finite Satisfiability
Class diagram reasoning methods can be classified into concrete reasoning methods that
directly solve specific problems [56, 42] and translation based methods that provide reasoning
by mapping UML models into a formal reasoning framework [15, 6, 54]¹. Concrete methods
tend to apply to error detection and revealing redundancy, while translation based methods
deal with general query answering for a variety of modeling needs.
¹ A UML class diagram is translated into a formula or expression in some other language, and the
translation is proved to be correct. The notion of correctness varies between studies. The formal notion
requires a proof of equivalence, i.e., a proof that the translation preserves all and only the implications
of the original class diagram.
Concrete Methods for Reasoning about Emptiness of Class Diagrams
Kaneiwa and Satoh [49] study the problem of full consistency in a subset of UML class
diagrams that includes classes with typed attributes and multiplicity constraints on the
attributes, unconstrained associations and constrained generalization sets. They identify three
factors for inconsistency in such diagrams: (1) combination of generalization with
disjointness; (2) attribute overwriting in multiple hierarchies; and (3) combination of completeness
and disjointness constraints in generalization sets. Based on these factors, they provide
tractable algorithms for deciding full consistency in the restricted class diagram model.
Concrete Methods for Reasoning about Finiteness of Class Diagrams
Reasoning on finiteness of entity relationship and class diagrams has attracted much
attention. The problem was independently identified by Lenzerini and Nobili [56] and by
Thalheim [86], with reference to entity relationship diagrams. Later the methods were
extended to various fragments of UML class diagrams. The problem is to detect diagrams that
are not strongly satisfiable, identify the cause, and suggest repairs.
There are two main approaches: (1) the linear programming approach, and (2) the graph
based approach. The first approach reduces the all class finiteness problem to the problem
of finding a solution to a system of linear inequalities. The second approach detects infinity
causing cycles in the diagram, and possibly suggests repair transformations. All methods
apply only to fragments of UML class diagrams. Detection of infinity in unrestricted UML
class diagrams is still an open issue.
The Linear Programming Approach. The fundamental method of Lenzerini and Nobili
[56] is defined for an entity relationship diagram that includes Entity types (Classes), n-ary
Relationship types (Associations), and Cardinality Constraints². The method consists of a
transformation of the cardinality constraints into a set of linear inequalities whose
variables stand for the sizes (cardinalities) of the entity and relationship types in a possible
instance. A relationship $R(rn_1 : C_1[min_2, max_2],\ rn_2 : C_2[min_1, max_1])$ (Figure 2.3) yields
the following inequalities:
− For $min_2 \neq 0$: $r \geq min_2 \cdot c_1$.
− For $max_2 \neq \infty$: $r \leq max_2 \cdot c_1$.
− For $min_1 \neq 0$: $r \geq min_1 \cdot c_2$.
− For $max_1 \neq \infty$: $r \leq max_1 \cdot c_2$.
where $r$, $c_1$, $c_2$ are variables that stand for the sizes of the respective entity or relationship
types. In addition, for every entity or association symbol $T$, insert the inequality $T > 0$.
Figure 2.3: Binary Association
The size of the inequality system is polynomial in the size of the diagram. The main
result is that the entity relationship diagram is fully finitely satisfiable if and only if the
inequality system has a solution. Since linear programming is solvable in polynomial time
in the size of the problem encoding, full finite satisfiability for this fragment of class diagrams
can be decided in polynomial time.
² Lenzerini and Nobili (1990) use the membership semantics for cardinality constraints (consult Balaban
and Shoval in [? ?] for semantics of cardinality constraints). For non-binary relationships, this is not the
standard semantics of cardinality constraints, neither in the entity relation model nor in the class
diagram model.
Example 2.1. Consider Figure 2.4, in which each course should have a single successor and at least
two predecessors. Applying the Lenzerini and Nobili method to this example yields the
unsolvable inequality system below:
1. The variables: c for Course and d for Dependency.
2. The system inequalities:
(a) The Dependency association inequalities:
− 1. $d \geq 2 \cdot c$.
− 2,3. $d = c$ (that is, $d \geq c$ and $d \leq c$).
(b) 4,5. $d, c > 0$.
Substituting $d = c$ into the first inequality gives $c \geq 2 \cdot c$, which contradicts $c > 0$.
Figure 2.4: Unsatisfiability due to multiplicity constraint conflict
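The construction used in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `association_inequalities` and `bounded_solution` are hypothetical, a multiplicity is given as "links per object of a class", and solvability is probed by a bounded search over small positive integers rather than by a linear programming solver. A `None` result therefore only means "no solution within the bound".

```python
from itertools import product
from typing import Callable, Dict, List, Optional, Tuple

# (label, predicate over a variable assignment); all names are illustrative
Inequality = Tuple[str, Callable[[Dict[str, int]], bool]]
Bounds = Tuple[int, Optional[int]]  # (min, max); max=None stands for '*'


def association_inequalities(r: str, c1: str, c2: str,
                             per_c1: Bounds, per_c2: Bounds) -> List[Inequality]:
    """per_cX = (lo, hi): every object of class cX takes part in lo..hi links of r."""
    result: List[Inequality] = []
    for cls, (lo, hi) in ((c1, per_c1), (c2, per_c2)):
        if lo != 0:  # a zero lower bound adds no inequality
            result.append((f"{r} >= {lo}*{cls}",
                           lambda v, lo=lo, cls=cls: v[r] >= lo * v[cls]))
        if hi is not None:  # an unbounded upper bound adds no inequality
            result.append((f"{r} <= {hi}*{cls}",
                           lambda v, hi=hi, cls=cls: v[r] <= hi * v[cls]))
    return result


def bounded_solution(variables: List[str], system: List[Inequality],
                     bound: int) -> Optional[Dict[str, int]]:
    """Try every assignment of 1..bound to the variables (this enforces T > 0)."""
    for values in product(range(1, bound + 1), repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(holds(assignment) for _, holds in system):
            return assignment
    return None


# Example 2.1: a Course takes part in exactly one link in one role
# and in at least two links in the other role.
example_2_1 = association_inequalities("d", "c", "c", per_c1=(1, 1), per_c2=(2, None))
print([label for label, _ in example_2_1])
print(bounded_solution(["c", "d"], example_2_1, bound=30))

# Relaxing "at least two" to "at least one" removes the conflict.
relaxed = association_inequalities("d", "c", "c", per_c1=(1, 1), per_c2=(1, None))
print(bounded_solution(["c", "d"], relaxed, bound=30))
```

For the recursive Dependency association both ends are Course, so both multiplicities constrain the same variable c, which is what makes the system contradictory.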
Calvanese and Lenzerini, in [24], extend the inequalities based method of [56] to apply
to schemata with class hierarchy constraints. The extension is based on the assumption
that class extensions may overlap. They provide a two stage algorithm in which the finite
satisfiability problem of a class diagram with ISA constraints is reduced to the finite
satisfiability problem of a class diagram without ISA constraints. Then, similarly to [56],
they check satisfiability of the new class diagram by deriving a special system of linear
inequalities (different from that of [56]).
The class diagram transformation process of [24] is fairly complex, and might introduce,
in the worst case, a number of new classes and associations that is exponential in the size
of the input diagram. The method was further simplified in [20], where class overlapping is
restricted to class hierarchy alone. The simplification of [20] reduces the overall number of
new classes and associations, but the worst case is still exponential.
Lenzerini and Nobili [56] were the first to suggest a method for identifying the cause of
strong satisfiability problems in restricted entity relationship diagrams. Their solution is not
constructive, as they do not provide a method for computing critical cycles. A first step towards
finding critical cycles appears in [84].
Dullea and Song [31] and Dullea et al. [32] characterize infinity causing structures (termed
structural invalidity) of recursive binary and ternary relationship types in entity relationship
diagrams. The analysis suggests a set of structure based decision rules for identifying
structural invalidity in entity relationship diagrams.
2.3 FiniteSat Algorithm
Correctness of a class diagram involves consistency and finite satisfiability [24, 56, 86, 15, 62].
A class is consistent if it has a non-empty extension in some legal instance; a class diagram
is consistent if all of its classes are consistent; a class is finitely satisfiable if it has a
non-empty extension in some legal finite instance; a class diagram is finitely satisfiable if all of its
classes are finitely satisfiable³. It can be shown that a consistent class diagram has a legal
instance in which all class extensions are non-empty, and a finitely satisfiable diagram has
a legal instance in which all class extensions are non-empty and finite [61]. Class diagrams
CD, CD′ are equivalent, denoted CD ≡ CD′, if they have the same legal instances.
³ Lenzerini and Nobili [56] used the notion of strong satisfiability for this term.
Complexity: Berardi et al., in [15], showed that deciding consistency of UML class
diagrams is EXPTIME-complete. Artale et al. [7] refine these results by considering fragments
of class diagrams. They show that for ER diagrams that include, besides cardinality, class
hierarchy and disjoint constraints, deciding consistency is in NLogSpace. Addition of
complete constraints raises the complexity to NP, and addition of association hierarchy
already yields EXPTIME-completeness.
Recently, it was shown [58, 78] that finite satisfiability of the description logic ALCQI is
EXPTIME-complete, which implies that finite satisfiability of class diagrams (under some
minor restrictions) is also EXPTIME-complete.
There are two main approaches for reasoning about finite satisfiability of class diagrams:
the linear inequalities approach and the graph based approach. The first approach reduces
the finite satisfiability problem to the problem of finding a solution to a system of linear
inequalities. The second approach detects infinity causing cycles in the diagram, and
possibly suggests repair transformations. All methods apply only to fragments of UML class
diagrams. Deciding finite satisfiability in unrestricted class diagrams is still an open issue.
Below, we briefly summarize results in both approaches, on which our research is based.
The fundamental work in the linear inequalities approach is that of [56, 85]. It applies
to Entity-Relationship (ER) diagrams with Entity Types (Classes), Binary Relationships⁴
(Associations), and Multiplicity Constraints.
⁴ They also allow n-ary relationships, but with non-standard (membership) semantics for cardinality
constraints.
Calvanese and Lenzerini, in [24], extend the inequalities based method of [56] to apply
to diagrams with class hierarchy constraints, but the size of the resulting system of inequalities
is exponential in the size of the class diagram. The simplification of [20] reduces the overall
number of new class and association variables, but the worst case is still exponential.
A method for identifying the cause of non finite satisfiability was first suggested
in [56]. The method is based on construction of a directed graph (digraph) whose nodes
stand for classes and associations, and whose edges connect association nodes with their end
class nodes. The edges are weighted by the multiplicity constraints, as shown in Figure 2.5.
The weight of a path is the product of the weights of its edges. The directed graph is
the means for detecting the causes of non finite satisfiability of a class diagram. Cycles
whose weight is less than 1 are termed critical cycles. They indicate non finite satisfiability.
Moreover, a critical cycle singles out a non-finitely satisfiable set of multiplicity constraints.
Similar approaches are introduced in [86, 42, 44].
Figure 2.5: The digraph representation of a binary association
The FiniteSat algorithm presented by Maraee and Balaban [11]:
Algorithm 2.1. The FiniteSat Algorithm
Input: A class diagram CD with binary multiplicity constraints, class hierarchy constraints,
GS constraints.
Output: A linear inequality system $\Psi_{CD}$
Method: Insert a variable for every class and association in CD.
1. For every multiplicity constraint, insert inequalities according to the Lenzerini and
Nobili method (see chapter 2).
2. For every class hierarchy constraint $B \prec A$, B being the sub-class with variable b, and
A being the super-class with variable a, add the inequality $a \geq b$.
3. For every GS constraint $GS(C, C_1, \dots, C_n; Const)$, C being the super-class with variable
c, the $C_i$ being the subclasses with variables $c_i$, and Const being the GS constraint, add
n class hierarchy inequalities $c \geq c_i$, $i = 1, \dots, n$, and the following inequalities:
− Const = disjoint: $c \geq \sum_{j=1}^{n} c_j$
− Const = complete: $c \leq \sum_{j=1}^{n} c_j$
− Const = incomplete: $\forall j \in [1, n].\ c > c_j$
− Const = overlapping: Without inequality
− Const = disjoint, incomplete: $c > \sum_{j=1}^{n} c_j$
− Const = disjoint, complete: $c = \sum_{j=1}^{n} c_j$
− Const = overlapping, complete: $c < \sum_{j=1}^{n} c_j$
− Const = overlapping, incomplete: $\forall j \in [1, n].\ c > c_j$
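Steps 2 and 3 can be sketched as a small generator of inequality strings. This is a minimal illustration, not the thesis implementation: the function names are hypothetical, inequalities are emitted as plain text, and the comparisons follow one reading of the list above, in which the bare disjoint and complete cases are non-strict while incomplete, {disjoint, incomplete} and {overlapping, complete} are strict.

```python
from typing import Iterable, List, Sequence


def hierarchy_inequality(sub_var: str, super_var: str) -> str:
    """Step 2: the super-class is at least as large as its sub-class."""
    return f"{super_var} >= {sub_var}"


def gs_inequalities(super_var: str, sub_vars: Sequence[str],
                    const: Iterable[str]) -> List[str]:
    """Step 3: hierarchy inequalities plus the inequality implied by Const."""
    labels = frozenset(const)
    total = " + ".join(sub_vars)
    result = [hierarchy_inequality(s, super_var) for s in sub_vars]
    # Combined labels are checked before the single-label cases.
    if {"disjoint", "complete"} <= labels:
        result.append(f"{super_var} = {total}")
    elif {"disjoint", "incomplete"} <= labels:
        result.append(f"{super_var} > {total}")
    elif {"overlapping", "complete"} <= labels:
        result.append(f"{super_var} < {total}")
    elif "disjoint" in labels:
        result.append(f"{super_var} >= {total}")
    elif "complete" in labels:
        result.append(f"{super_var} <= {total}")
    elif "incomplete" in labels:  # also covers {overlapping, incomplete}
        result.extend(f"{super_var} > {s}" for s in sub_vars)
    # a bare 'overlapping' label adds no inequality
    return result


print(gs_inequalities("c", ["c1", "c2"], {"disjoint", "complete"}))
print(gs_inequalities("c", ["c1", "c2"], {"overlapping", "incomplete"}))
```

Emitting strings keeps the sketch independent of any particular solver; a real tool would build solver constraints instead.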
Proving the correctness of the FiniteSat algorithm requires analysis of the structure of
class hierarchies. For that purpose, we consider the graph of class hierarchy constraints alone,
in which nodes represent classes and directed edges represent ISA constraints, directed from
super-classes to their subclasses (association lines are removed). We consider two versions
of such graphs: Directed and undirected. Three class hierarchy structures are analyzed:
1. Tree class hierarchy : The directed graph of the class hierarchy forms a tree, as in
Figure 1.1.
2. Acyclic class hierarchies: The undirected graph of the class hierarchy is acyclic. In
Figure 2.6-a, the directed class hierarchy is not a tree, as F is a sub class of both C
and D, but the undirected class hierarchy graph is acyclic (a tree).
3. Cyclic class hierarchies: The undirected graph of the class hierarchy is cyclic. Multiple
inheritance is unrestricted, as the undirected induced graph can be cyclic. In Figure
2.6-b, class F has two ISA paths to its super-class A. The ISA path A,B, F,C,A
forms an undirected ISA cycle.
Figure 2.6: Unconstrained Hierarchy Structures
The correctness of Algorithm FiniteSat is proved via a reduction of the finite
satisfiability of a class diagram CD to the finite satisfiability of a class diagram CD′ that does not
include class hierarchies, and to which the method of [56] therefore applies. CD′ is created as
follows: Initialize CD′ as a copy of CD. Replace all class hierarchy constraints with new regular binary
associations (termed henceforth ISA associations) between the super-class and each of its subclasses.
The multiplicity constraints on these associations are a 1..1 participation constraint for the
subclass (written on the super-class end in the diagram) and a 0..1 participation constraint
for the super-class. Figure 1.1-b shows the reduced class diagram of Figure 1.1-a.
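The construction of CD′ can be sketched as follows. The data model is an assumption made for illustration only: `Association`, its field names and `replace_hierarchies` are hypothetical, and a participation constraint is stored as the number of links per object of an end class.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple

Bounds = Tuple[int, Optional[int]]  # (min, max); max=None stands for '*'


@dataclass(frozen=True)
class Association:
    name: str
    end1: str
    end2: str
    participation1: Bounds  # links per object of end1
    participation2: Bounds  # links per object of end2


def replace_hierarchies(associations: Iterable[Association],
                        hierarchies: Iterable[Tuple[str, str]]) -> List[Association]:
    """Return the associations of CD': the original ones plus one ISA association
    per (sub_class, super_class) hierarchy constraint."""
    isa = [
        Association(
            name=f"isa_{sub}_{sup}",
            end1=sup,
            end2=sub,
            participation1=(0, 1),  # a super-class object has at most one ISA link
            participation2=(1, 1),  # a sub-class object has exactly one ISA link
        )
        for sub, sup in hierarchies
    ]
    return list(associations) + isa


cd_prime = replace_hierarchies([], [("B", "A")])
print(cd_prime)
```

With variables isa, a and b, the multiplicity inequalities for such an ISA association are isa = b and isa <= a, so eliminating isa leaves a >= b, the inequality that step 2 of the algorithm adds directly.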
Lemma 2.1. Finite satisfiability of CD is reducible to the finite satisfiability of CD′.
Proof. (Sketched) The reduction is defined by bi-directional translations between non-empty
finite legal instances I and I′ of CD and CD′, respectively. The translations rely on a
mapping $T$ (and its inverse $T^{-1}$) from I′ to I, which collapses a structure of ISA-linked
objects in I′ into a single object in I. The intuition is that CD′ splits a single instance
object of CD into its components in its ancestor classes.
A crucial property of the T translation is that ISA-linked objects in I′ should not include
two objects from the same class. This property, termed the Single Class property, ensures
that the T mapping maps an instance I′ of CD′ to an instance I of CD. The main problem
is showing that the mapping preserves multiplicity constraints (otherwise, when collapsed
into a single object in I, the links of two objects are combined into links of a single one).
The full proof appears in [61].
The reduction is proved by considering the three forms of class hierarchy graphs. For
trees and for acyclic hierarchies, the single class property holds for every instance. For cyclic
class hierarchies, it is shown that if a diagram is finitely satisfiable, then it has an instance
that satisfies the single class property.
Claim 2.2 (FiniteSat correctness - without GS constraints). A class diagram with binary
multiplicity constraints and class hierarchy constraints is finitely satisfiable if and only if the
inequality system constructed by Algorithm FiniteSat is solvable.
Proof. (Sketched) Given a class diagram CD, construct a class diagram CD′ as above,
to which the inequalities method of [56] is applied. Based on Lemma 2.1, CD is finitely
satisfiable if and only if the inequality system of [56] for CD′ is solvable. It is not hard
to show that this inequality system is equivalent to the inequality system constructed by
FiniteSat.
The results of this claim can be extended to class diagrams with GS constraints and an
acyclic class hierarchy structure, or a cyclic structure in which class hierarchy cycles do not
include disjoint or complete constraints. The scope of the FiniteSat algorithm is defined
in the following claim:
Claim 2.3 (Partial correctness - GS constraints, cyclic hierarchy). A class diagram
with binary multiplicity constraints, class hierarchy constraints, and GS constraints,
in which class hierarchy cycles include disjoint or complete constraints, is not
finitely satisfiable if the inequality system constructed by Algorithm FiniteSat is not
solvable.
Proof. In cyclic class hierarchies, the disjoint or complete GS constraints might have an
implicit global effect on other generalization sets in a cycle. Therefore, if the inequality
system does not have a solution, the corresponding diagram does not have a legal finite
non-empty instance, but a solution for the inequalities might miss the implicit constraints.
Claim 2.4 (Complexity of the FiniteSat algorithm). The time needed to construct the inequalities
by FiniteSat, and their number, are O(n), where n is the number of constraints in the class
diagram.
Proof. Every constraint contributes a constant number of inequalities.
Table 2.2 summarizes the results of the above claims.
Graph Structure   With/Without GS constraints          FiniteSat correctness
Acyclic           Without                              correct
Acyclic           With                                 correct
Cyclic            Without                              correct
Cyclic            No disjoint or complete in cycles    correct
Cyclic            Disjoint or complete in cycles       sound for unsatisfiability
Table 2.2: The Scope of The FiniteSat Algorithm
Chapter 3
FiniteSat Algorithm : Extension to
Qualifier
The qualifier constraint is not syntactic sugar. Rather, it significantly enriches the modeling
capabilities by strengthening multiplicity constraints, providing elegant refinements on
associations and designing lookup structures in software. The qualifier constraint comes together
with a multiplicity constraint, and can thus play a central role in causing finite satisfiability problems
in class diagrams. This chapter deals with the finite satisfiability problem when a qualifier
constraint is imposed on an association. A qualifier constraint on a binary association is, generally
speaking, a slot for an attribute or list of attributes, in which the values of the attributes
select a unique related object or a set of related objects from the entire set of objects related
to an object by the association [77, 72, 55].
Figure 3.1: Visual Qualifier Notation
The qualifier rectangle is part of the association line, not part of the class. The qualifier
is attached to the class that it qualifies - that is, an object of the qualified class, together with
a value of the qualifier, selects a set of target class objects on the other end of the association.
As said before, there may be one or more such attributes. The qualifier provides essential detail, the
omission of which would modify the inherent character of the relationship. It is possible for
both ends of a binary association to have qualifiers, but this is rare and, as far as can be seen in
the literature, not used in practice.
3.1 Qualifier Explained
A binary association maps objects between classes. Sometimes it is desirable to partition
the objects of an end class, and refine the multiplicity constraints. We now demonstrate
this with a series of examples. Each example further details how an object can be selected
out of the set, and why regular multiplicity constraints do not suffice.
1. Example 1. Consider the binary association between the class Directory and the class
File in Figure 3.2. The intuitive meaning of this simple class diagram is that every
directory is associated with many (possibly zero) files and each file is associated
with many (possibly zero) directories. This simple modeling situation can be sharpened
by adding a qualifier, to reflect additional constraints.
Figure 3.2: A binary association example.
Consider a Unix file system, in which each directory consists of elements (files, directories
or links) identified by their names. Cardinality constraints cannot capture the key role of the
name, but adding a qualifier can, as shown in Figure 3.3.
Figure 3.3: Modeling a Unix file system.
The example shows how adding a qualifier tightens the multiplicity in the forward
direction over the association. Recall the symbolic notation from chapter 2:
$R(rn_1 : SourceClass\{(q_1, T_1) \dots (q_n, T_n)\}[min_1, max_1],\ rn_2 : TargetClass[min_2, max_2])$
The source class in this example is Directory, the target class is File, $(q_1, T_1)$ is
(fileName, Name), and the multiplicity constraints are 0..∗ and 0..1, respectively.
2. Example 2. Suppose we would like to model some lookup data structure. In the general
case we can just create a simple binary association between two classes, say Array and
Object, with a many-to-many multiplicity constraint between them. However, we
can be a bit more sophisticated by adding a qualifier to specify non trivial constraints.
The model in Figure 3.4 shows an array with exactly one object related to
every index in the array:
Figure 3.4: Modeling an array with qualifier.
As we see from the above examples, when a natural index exists, it is beneficial to use a
qualifier.
3.2 Discussion on Qualifier Semantics
Today, there is no single, universally agreed formal semantics of UML, and much recent
research has targeted this issue [47, 83]. The OMG specification [72] provides
only verbal semantics, sometimes accompanied by OCL at the meta model level. In
particular, the qualifier must have formal semantics in order to develop automated tools.
The general visual form of a qualifier constraint is presented in Figure 3.5, and symbolically
denoted $R(rn_1 : A\{(q_1, T_1) \dots (q_n, T_n)\}[min_1, max_1],\ rn_2 : B[min_2, max_2])$, where A
stands for SourceClass and B stands for TargetClass.
Figure 3.5: A general qualifier constraint
In order to formulate its semantics, consider the Bank example in Figure 3.6. The account
number serves as a qualifier that uniquely identifies an account.
Without the qualifier, the multiplicity constraint on the Account side is 0..∗ (actually, no
multiplicity constraint).
Figure 3.6: A general qualifier constraint
A qualifier constraint might have several attributes, each associated
with a value domain. In that case, the combined value domain of the qualifier constraint is
the Cartesian product of the attributes' value domains.
The semantics of a qualifier constraint is combined with its associated multiplicity
constraint. The general idea is that the combined values of the qualifier attributes impose a
partition on the set of target class instances that are linked to a source class instance. That
is, for an instance s of the source class, a combined value of the attributes identifies a set of
target class instances that are linked to s. The multiplicity constraint at the target class end
imposes bounds on the size of the partition classes. There are two different interpretations
for qualifier semantics:
1. Universal semantics: The qualified multiplicity constraint applies to every qualified
object of the source class.
2. Existential semantics: The qualified multiplicity constraint applies only to source class
objects that already participate in the association.
In this work we adopt the universal semantics. Given a qualifier constraint Q:
$R(rn_1 : A\{(q_1, T_1) \dots (q_n, T_n)\}[min_1, max_1],\ rn_2 : B[min_2, max_2])$,
its semantics is defined as follows: For a legal instance I, $Q^I$ is a function that maps every
instance of $A^I$ and a combined value of $T_{q_1}, \dots, T_{q_n}$ to a set of $B^I$ instances:
$Q^I : A^I \times T_{q_1} \times \dots \times T_{q_n} \to 2^{B^I}$
The mapping $Q^I$ satisfies the following constraints:
1. The set of $B^I$ instances to which an $A^I$ instance and a combined domain value are
mapped is restricted by the r multiplicity constraint on the B end:
$\forall a \in A^I, t_1 \in T_{q_1}, \dots, t_n \in T_{q_n} : min_2 \leq |Q^I(a, t_1, \dots, t_n)| \leq max_2$
2. $Q^I$ is a partition of $r^I$:
(a) The set of $B^I$ instances to which an $A^I$ instance a and a combined domain value
are mapped is included in the set $r^I/a$ of $B^I$ instances that are $r^I$ linked to a:
$\forall a \in A^I, t_1 \in T_{q_1}, \dots, t_n \in T_{q_n} : Q^I(a, t_1, \dots, t_n) \subseteq r^I/a$
(b) For a given $A^I$ instance, different combined domain values are mapped to disjoint
sets of $B^I$ instances:
$\forall a \in A^I, t_1 \in T_{q_1}, \dots, t_n \in T_{q_n}, t'_1 \in T_{q_1}, \dots, t'_n \in T_{q_n}$ with
$(t_1, \dots, t_n) \neq (t'_1, \dots, t'_n)$ : $Q^I(a, t_1, \dots, t_n) \cap Q^I(a, t'_1, \dots, t'_n) = \emptyset$
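The three conditions above can be checked mechanically on a small finite instance. The sketch below is an illustration under stated assumptions: the function name and the encoding (links as pairs, the mapping as a dictionary keyed by a source object followed by its qualifier values, a missing key meaning the empty set) are hypothetical, and it only works for finite qualifier domains because it enumerates every combined value, as the universal semantics requires.

```python
from itertools import product
from typing import Dict, Hashable, Iterable, Optional, Sequence, Set, Tuple


def satisfies_qualifier(a_ext: Iterable[Hashable],
                        domains: Sequence[Sequence[Hashable]],
                        links: Set[Tuple[Hashable, Hashable]],
                        q: Dict[tuple, Set[Hashable]],
                        min2: int, max2: Optional[int]) -> bool:
    """Check constraints 1, 2(a) and 2(b) under the universal semantics."""
    for a in a_ext:
        linked_to_a = {b for (x, b) in links if x == a}  # the set r^I/a
        already_selected: Set[Hashable] = set()
        for values in product(*domains):  # every combined qualifier value
            selected = q.get((a, *values), set())
            if len(selected) < min2:                       # constraint 1, lower bound
                return False
            if max2 is not None and len(selected) > max2:  # constraint 1, upper bound
                return False
            if not selected <= linked_to_a:                # constraint 2(a)
                return False
            if selected & already_selected:                # constraint 2(b)
                return False
            already_selected |= selected
    return True


# Toy instance with a two-valued qualifier domain (values are made up).
networks = {"tv"}
weekdays = ["mon", "tue"]
links = {("tv", "s1"), ("tv", "s2")}
q_ok = {("tv", "mon"): {"s1"}, ("tv", "tue"): {"s2"}}
q_missing = {("tv", "mon"): {"s1"}}
q_overlap = {("tv", "mon"): {"s1"}, ("tv", "tue"): {"s1"}}

print(satisfies_qualifier(networks, [weekdays], links, q_ok, 1, 1))
print(satisfies_qualifier(networks, [weekdays], links, q_missing, 1, 1))
print(satisfies_qualifier(networks, [weekdays], links, q_overlap, 1, 1))
```

The second call fails only because of the universal reading: the value "tue" selects nothing, yet the lower bound of 1 applies to every combined value.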
Having seen this formal definition of the qualifier constraint, there is one more
sensitive matter (related to 2.b in the above definition) that should be clarified: the multiplicity
constraints on the qualified side of the association. After the qualifier is added to the
association, as mentioned earlier, a combined domain value appears. However, the semantics
of the multiplicity constraints does not change. That is, the $min_1..max_1$ constraints are
imposed on the class alone, and not on the combined value. Formally, the following
condition should hold in a legal instance: $\forall b \in B^I\ \exists a_1, \dots, a_k$ such that
$b \in Q^I(a_i, t_1, \dots, t_n)$, where $i \neq j$ implies $a_i \neq a_j$, with $min_1 \leq k \leq max_1$.
Although this detail may seem a bit redundant now, and partially an implication of the
qualifier semantics above, it will be very useful in upcoming sections of this work.
In cases where the number of possible qualifier values is infinite, the lower bound at the target
class end should not be strictly larger than zero. We elaborate on this through the
example in Figure 3.7, which shows part of a class diagram modeling a health care system.
Figure 3.7: Example of multiplicity bounds under different semantic interpretations. (a) Shows
a situation unsuitable for the universal interpretation. (b) Shows the unconstrained model
with zero lower bound.
Figure 3.7 (a) shows a situation where the lower multiplicity bound on the number
of patients for a doctor in a day is 5. Under universal semantics, this means that for each
and every day, at least five and at most ten patients are accepted. The issue with
this situation is: what about dates in the past, say 100 years ago? What
about dates in the future? Since every value that the qualifier can take belongs to the type day,
each such day must have at least 5 patients. The situation is resolved in Figure 3.7 (b), where the
lower bound in the association serves is not as restrictive. Here the lower bound is zero, so
a Doctor instance does not need to have at least five links with Patient instances.
However, this model is less expressive, and some additional explanation, either verbal or formal
(like OCL), may be needed.
The existential semantics interpretation says that the multiplicity
constraints hold only when the combined qualifier value exists. Put otherwise, in the previous
example in Figure 3.7 (a) we could say: given a day on which a doctor actually worked (and
since such a date exists in our system), he accepted at least 5 and at most 10 patients. Such
an interpretation makes it possible to use qualifiers whose attributes take values
from different domains, without specifying where these values actually come from.
3.3 FiniteSat Extension for Qualifier Constraint
In previous chapters we discussed the importance of constraints in general and qualifier
constraints in particular. Qualifier constraints significantly enrich the modeling capabilities
by strengthening multiplicity constraints. However, with greater modeling expressiveness,
more finite satisfiability problems arise.
Finite satisfiability problems in the presence of qualifier constraints arise due to cycles of
conflicting multiplicity constraints that include qualifier constraints with finite attribute
domains.
Figure 3.8: A TV-Network with broadcast schedule example
Figure 3.8 shows a situation where infinity results from a qualifier constraint. The qualifier
constraint implies that in every legal instance, the number of broadcast schedules, b, is 7 × t,
where t is the number of TV networks. This is because each of the 7 possible values of the Weekday
enumeration, together with an instance of TV-Network, has a link to a BroadcastSchedule
instance. But, at the same time, t = b by the multiplicity constraints on the networkSchedule
association. The only solution is that both classes are either empty or infinite.
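The arithmetic behind this conclusion can be written out as a short derivation. It uses only the two equalities stated in the text; the labels on the right are a paraphrase, not additional constraints taken from Figure 3.8.

```latex
\begin{align*}
b &= 7t && \text{(qualifier constraint, 7 values of Weekday)}\\
t &= b  && \text{(multiplicity constraints on networkSchedule)}\\
\Rightarrow\quad t &= 7t \;\Longrightarrow\; 6t = 0 \;\Longrightarrow\; t = 0,\ b = 0 .
\end{align*}
```

No assignment with $t > 0$ and $b > 0$ satisfies both equalities, so the inequality system that requires positive class sizes has no solution.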
In this chapter we present a method for detecting finite satisfiability problems in class
diagrams that include binary multiplicity constraints, class hierarchy constraints, GS constraints
and qualifier constraints. The method builds on the FiniteSat algorithm of Maraee and Balaban
[62, 61], presented in chapter 2, which reduces the finite satisfiability of a UML class diagram
with the first three kinds of constraints to the solvability of a system of linear inequalities,
and then tests for the existence of a solution.
Extending the FiniteSat algorithm to account for qualifier constraints involves the
addition of step (4) below:
Algorithm 3.1. The FiniteSat Algorithm
Input: A class diagram CD with binary multiplicity constraints, class hierarchy constraints,
GS constraints and qualifier constraints.
Output: A linear inequality system ΨCD
Method: Insert a variable for every class and association in CD.
1. For every multiplicity constraint, insert inequalities according to the Lenzerini and
Nobili method (see chapter 2).
2. For every class hierarchy constraint B ≺ A, B being the sub-class with variable b, and A being the super-class with variable a, add the inequality $a \ge b$.
3. For every GS constraint GS(C, C1, ..., Cn; Const), C being the super-class with variable c, the Ci being the subclasses with variables ci, and Const being the GS constraint, add n class hierarchy inequalities $c \ge c_i$, $i = 1, \ldots, n$, and the following inequalities:
− Const = disjoint: $c \ge \sum_{j=1}^{n} c_j$
− Const = complete: $c \le \sum_{j=1}^{n} c_j$
− Const = incomplete: $\forall j \in [1, n].\; c > c_j$
− Const = overlapping: without inequality
− Const = disjoint, incomplete: $c > \sum_{j=1}^{n} c_j$
− Const = disjoint, complete: $c = \sum_{j=1}^{n} c_j$
− Const = overlapping, complete: $c < \sum_{j=1}^{n} c_j$
− Const = overlapping, incomplete: $\forall j \in [1, n].\; c > c_j$
4. For every qualifier constraint Q, given as R(rn1 : A{(q1, T1)...(qn, Tn)}[min1, max1], rn2 : B[min2, max2]) (as described in Figure 3.5):
(a) If the combined domain value of Q is non-finite, ignore the qualifier constraint, and handle the association according to the Lenzerini and Nobili method [56].
(b) Otherwise, extend the inequality system with the following inequalities:
$min_1 \times b \le r \le max_1 \times b$
$min_2 \times a \times t_{q_1} \times \ldots \times t_{q_n} \le r \le max_2 \times a \times t_{q_1} \times \ldots \times t_{q_n}$
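To make step 4(b) concrete, the following minimal Python sketch evaluates the two inequalities for candidate class sizes and searches for a positive solution by bounded enumeration. It is an illustration only: FiniteSat itself tests solvability of the linear inequality system, not a bounded search, so a `None` result here is evidence up to the bound rather than a proof. The names `link_range` and `find_sizes` are hypothetical, each t_qi is assumed to be the size of the finite domain of qualifier attribute qi, and the 1..1 multiplicities used for the TV-network example are an assumption, since Figure 3.8 is not reproduced here.

```python
from math import prod


def link_range(a, b, min1, max1, min2, max2, domain_sizes=()):
    """Bounds on the link count r implied by step 4(b).

    a, b are candidate sizes of classes A and B; domain_sizes holds the
    assumed sizes t_q1..t_qn of the qualifier domains (an empty tuple means
    an unqualified association, for which the product is 1).
    """
    t = prod(domain_sizes)
    low = max(min1 * b, min2 * a * t)
    high = min(max1 * b, max2 * a * t)
    return low, high


def find_sizes(associations, bound):
    """Bounded search for positive sizes (a, b) that satisfy every
    association between A and B; returns None if none exists up to bound."""
    for a in range(1, bound + 1):
        for b in range(1, bound + 1):
            ranges = [link_range(a, b, *spec) for spec in associations]
            # r must be a positive integer inside [low, high]
            if all(max(low, 1) <= high for low, high in ranges):
                return a, b
    return None


# Assumed 1..1 multiplicities for the TV-network example of Figure 3.8.
tv_network = [
    (1, 1, 1, 1, (7,)),  # qualified association; Weekday has 7 values
    (1, 1, 1, 1, ()),    # networkSchedule, unqualified
]
# A consistent variant: the unqualified association requires 7 schedules per network.
consistent = [
    (1, 1, 1, 1, (7,)),
    (1, 1, 7, 7, ()),
]

print(find_sizes(tv_network, 50))   # None
print(find_sizes(consistent, 50))   # (1, 7)
```

In the first system the qualified association forces b = 7a while the unqualified one forces b = a, so no positive sizes exist; in the second both associations agree on b = 7a.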
3.3.1 Correctness and Complexity of FiniteSat
Next we give an extension of the proof of the original algorithm, which can be found in [11].
The correctness of Algorithm FiniteSat is proved in two steps:
1. First, finite satisfiability of a class diagram CD is reduced to the finite satisfiability of a class diagram CD′ that does not include class hierarchies and qualifiers.
2. Second, finite satisfiability of CD′ is reduced to solvability of the inequality system produced by FiniteSat.
The reductions depend on the structures of class hierarchies, and on the presence of GS constraints. The first step reduction does not hold for class hierarchy structures whose graphs include cycles with disjoint or complete constraints¹.
Qualifier constraints with a non-finite combined value domain are removed (recall that their minimum multiplicity constraint is 0). A qualifier constraint as in Figure 3.5 is replaced by two associations and a new class, as in Figure 3.9. The instances of the new class Q stand for all combinations of an A instance and a value in the combined value domain of Q.
Figure 3.9: A reduced qualifier constraint
3.3.1.1 Reduction of Finite Satisfiability to a Class Diagram without Class Hierarchy Constraints and without Qualifiers
Translation of CD to CD′:
1. Initialize CD′ by CD.
2. Replace every GS constraint GS(C,C1, . . . , Cn; constraint) by n class hierarchy con-
straints C1 ≺ C, . . . , Cn ≺ C.
3. Replace all class hierarchy constraints with new regular binary associations (termed henceforth ISA associations) between the super-class and the subclasses. The multiplicity constraints on these associations are a 1..1 participation constraint for the subclass (written on the super-class end in the diagram) and a 0..1 participation constraint for the super-class. Figure 3.10-b shows the reduced class diagram of Figure 3.10-a.
1For details of analysis of structure of class diagrams see [11]
4. For a GS constraint GS(C, C1, . . . , Cn; const) in CD, if ISA1, . . . , ISAn are the associations in CD′ that replace its n class hierarchy constraints (entry (2) above), insert in CD′ a GS constraint const' on these ISA associations, as follows:
(a) const = disjoint:
const' = "every object e of C may participate in exactly one link of the associations ISA1, . . . , ISAn" (a xor-constraint on the ISA associations).
(b) const = complete:
const' = "every object of C participates in an ISA association link".
(c) const = incomplete:
const' = "there exists an object of C that does not participate in any ISA association link".
(d) const = overlapping:
const' = "there exists an object of C that participates in at least two ISA association links".
If const is a pair constraint, insert in CD′ a constraint that combines the constraints of its components.
5. Replace every qualifier constraint Q, given as R(rn1 : A{(q1, T1)...(qn, Tn)}[min1, max1], rn2 : B[min2, max2]),² with a new class Q and associations rq and r′, with multiplicity constraints 1 to tq1 ∗ .. ∗ tqn and min1..max1 to min2..max2 respectively, as shown in Figure 3.9.
Note: The FiniteSat algorithm adopts the strict interpretation of the overlapping and incomplete constraints, which requires the existence of at least one instance with overlapping and incomplete covering, respectively. The constraint const' in CD′ reflects the strict semantics.
2as described in Figure 3.5
Figure 3.10: A class diagram reduction
Mapping instances between CD and CD′:
1. Γ - Mapping an instance I of CD to an instance I′ of CD′:
(a) I′ has the semantic domain of I, the same class extensions and association extensions for all associations in CD, and additional class and association extensions for the associations added due to qualifier constraints³.
(b) For every class hierarchy constraint D ≺ C in CD (including class hierarchies that are implied by GS constraints), and e ∈ D^I: if the corresponding ISA association is ISA_D, relate e to itself in I′ by ISA_D, i.e., (e, e) ∈ ISA_D^{I′}.
(c) For every qualifier Q, every instance e of an associated object with a qualifier value is related by Γ(e) to two new objects ai and qi with a link between them. The target object bi is linked to qi. That is, (ai, qi) ∈ rq^{I′} and (qi, bi) ∈ r′^{I′}. We visualize this process in Figure 3.13.
2. Γ′ - Mapping an instance I′ of CD′ to an instance I of CD:
(a) Collapse ISA linked objects: For every structure of ISA linked objects o1, . . . , on in I′, insert a single new object o into all classes of the objects in the structure: o = Γ′(oi), for i = 1, . . . , n. Figure 3.11 demonstrates the Γ′ mapping. The intuition is that CD′ splits a single instance object of CD into its components in its ancestor classes.
(b) Populate the CD^I classes with the rest of the I′ objects that are not ISA related: For every o ∈ C^{I′} such that o is not ISA linked: Γ′(o) = o ∈ C^I.
(c) Preserve association extensions: For every regular (not Rq) association a in CD′, and link (o1, o2) ∈ a^{I′}, insert the link (Γ′(o1), Γ′(o2)) into a^I. The Rq links are not mapped (intuitively, they shrink into the combined qualifier value). The r′ links are mapped into links between the target object and the combined domain value with the qualified object, that is, between Γ′(e1) and Γ′(e2), as in Figure 3.12⁴.
³ As defined in the translation phase.
Figure 3.11: The Γ′ mapping of a CD′ instance to a CD instance
Figure 3.12: Γ′ Mapping of instances
⁴ e1 is the qualifier combined value in I′.
The goal now is to show that the above mappings preserve legal instances. That is, if I is a legal instance of CD, then I′ is a legal instance of CD′, and vice versa. While this is immediate for the Γ mapping, it is not always true for Γ′.
I. Preserving finite satisfiability from CD to CD′ - The Γ translation:
Claim 3.1. If CD is finitely satisfiable, then CD′ is also finitely satisfiable.
Proof. Let I be a non-empty finite legal instance of CD, and denote I′ = Γ(I). All multiplicity constraints on regular associations and on ISA constraints are satisfied. It remains to show that I′ satisfies the corresponding GS constraints. For a GS constraint GS{C, C1, ..., Cn; Const} in CD:
1. Const=disjoint: The extensions of C1, ..., Cn in I are pairwise disjoint. Therefore, for an object e ∈ Ci^I, (e, e) ∈ ISA_i^{I′} and for each j ≠ i, (e, e) ∉ ISA_j^{I′}. Hence the xor-constraint is satisfied.
2. Const=complete: An object e ∈ C^I is also an object of at least one subclass Ci^I. Therefore, (e, e) ∈ ISA_i^{I′}.
3. Const=incomplete: There exists an object e ∈ C^I which does not belong to any sub-class of C. Therefore, it does not participate in any ISA-link in I′.
4. Const=overlapping: There are two classes from C1, ..., Cn that overlap in I. If e ∈ Ci^I ∩ Cj^I, then (e, e) ∈ ISA_i^{I′} ∩ ISA_j^{I′}.
The proofs for pair constraints are obtained by combining the proofs of the single constraints.
For every instance e of an associated object with a qualifier, Γ(e) yields two new objects a1 and q1 with a link between them. The target object b1 is linked to q1. We visualize this process in Figure 3.13.
Auxiliary claim. I′ is a legal, fully finite instance of CD′. That is:
1. I′ is finite and has non-empty instances for all classes.
2. I′ satisfies the multiplicity constraints of CD′.
Figure 3.13: The Γ Mapping
Proof. 1. I′ is constructed by applying the mapping Γ to I, which is a legal, fully finite instance of CD. In each step a constant number of new objects is instantiated, so, since I is finite, I′ is finite as well. All objects of I are mapped to their corresponding objects in I′, thus I′ is also fully instantiated.
2. Note that the only constraints newly added in CD′ are the multiplicity constraints that the reduction imposes on the newly created class representing the qualifier combined value. These constraints are not violated, since the multiplicity is set to the size of the domain to which the combined qualifier value belongs, and since this domain is finite, no more and no fewer new objects can be instantiated in I′ by Γ from a legal instance I of CD.
II. Preserving finite satisfiability from CD′ to CD - The Γ′ translation:
The Γ′ translation might map a legal instance I′ of CD′ into an illegal instance I of CD. Therefore, it is necessary to characterize the legal instances of CD′ whose Γ′ translation yields a legal CD instance. The single class property defined below (for details see [61, 11]) guarantees that Γ′(I′) is a legal CD instance, for a legal CD′ instance I′.
Definition [Single class property]: An instance I′ of CD′ has the single class property if no structure of ISA-linked objects includes two objects from the same class.
In [11] Balaban and Maraee characterize the above property and explore the class hierarchy structures in which it holds, providing the following result:
Claim 3.2. If a non-empty, finite legal instance I′ of CD′ satisfies the single class property, then Γ′(I′) is a non-empty, finite legal instance of CD.
Proof. See [11].
Claim 3.3. Single class property existence
1. If CD has a tree or acyclic class hierarchy structure, then every legal instance of CD′ satisfies the single class property.
2. If the class hierarchy structure of CD does not include cycles with a disjoint or a complete constraint, then a finitely satisfiable CD′ has a non-empty finite legal instance I′′ that satisfies the single class property. Moreover, I′′ can be efficiently constructed from any non-empty finite legal instance.
Proof. See [11].
Next we give the claim that justifies the mapping of qualifier-related objects.
Claim 3.4. I is a legal finite instance of CD. That is:
1. I is finite and has non-empty instances for all classes.
2. I satisfies the qualifier constraints of CD (all the other constraints remain unchanged).
Proof. 1. I is finite since it is constructed from I′, which is finite. All classes in I are non-empty since all classes in I′ are non-empty, and the only objects that appear in I′ and do not appear in I are those of qualifier domain values.
2. We now show that I satisfies all qualifier constraint specifications.
(a) Multiplicity constraints on the target end. The multiplicity constraints, namely the min2..max2 bounds, are satisfied in I′. Since the link of association α simply changes its left end from the Q object to the combined qualifier end, the associated multiplicity constraints in I are not violated.
(b) Partition constraints.
i. Correct mapping. It is obvious from T that the set of B^I instances to which an A^I instance a and a combined domain value are mapped is included in the set of instances to which a is linked.
ii. Disjointness of target sets. Consider two different objects q1 and q2 in I′ that are linked to the same object a of the A class (the qualified class) in I′. If the objects q1 and q2 are linked to different instances of B, the disjointness is not violated in I, by T. The problematic situation arises when they are linked to the same object of B, say b1. In this case we show that there still exists another legal instance of CD′ without this situation. It is constructed by splitting the object b1 into two distinct objects b11 and b12 that are linked to q1 and q2 respectively. In order to satisfy all constraints imposed on class B in CD, we copy all the rest of I′ linked to b1 to both b11 and b12. If such a situation still exists with another instance of Q, we repeat this process.
We further give a visual elaboration of the above proof. In the case where two different objects q1 and q2 in I′ are each linked to a different object of A, the disjointness is not violated in I by T. For demonstration consider Figure 3.14. In the case where more than one Q object is linked to an A object we get the situation shown in Figure 3.15. Now recall the most complicated and confusing case, where two objects q1 and q2 are linked to the same object of A and the same object of B. The splitting of the B object into two new objects is demonstrated in Figure 3.16.
Figure 3.14: Γ′ Mapping on different A objects.
Figure 3.15: Γ′ Mapping on the same A object.
Based on Claims 3.1, 3.2, 3.3 and 3.4 we get the main result for reducing finite satisfiability between class diagrams:
Theorem 3.5 (Reduction of finite satisfiability between class diagrams). Let CD_{M,CH,GS,Q} denote class diagrams with multiplicity constraints, class hierarchy, GS and qualifier constraints, and let CD_{M,GS} denote class diagrams as in the CD′ construction, with multiplicity constraints and GS constraints on ISA associations.
Figure 3.16: Γ′ Mapping on the same A object and same B object.
1. Finite satisfiability in CD_{M,CH,GS,Q} is reducible to finite satisfiability in CD_{M,GS}, for class diagrams in CD_{M,CH,GS,Q} without class hierarchy cycles that include a disjoint or a complete constraint.
2. A finitely satisfiable class diagram in CD_{M,CH,GS,Q} can be effectively translated into a finitely satisfiable class diagram in CD_{M,GS}.
Corollary 1. If a class diagram CD in CD_{M,CH,GS,Q} is translated into a non-finitely satisfiable class diagram in CD_{M,GS}, then CD is also non-finitely satisfiable.
3.3.1.2 Reduction of Finite Satisfiability of CD_{M,GS} to solvability of the inequality system produced by FiniteSat
Claim 3.6. If CD′ is finitely satisfiable, then Ψ_CD is solvable.
Proof. There are four kinds of inequalities, introduced for multiplicity constraints, class hierarchy constraints, GS constraints and qualifiers.
1. Multiplicity constraint inequalities: Satisfied, as shown in [56].
2. Class hierarchy inequalities: Satisfied, as shown in [11].
3. GS constraint inequalities: Satisfied, as shown in [11].
4. Qualifier constraint inequalities: As shown earlier in the description of the translation of CD into CD′, the qualifier disappears, and a new class Q is added with multiplicity constraints. The solution for the inequalities of these new multiplicity constraints is the same as for the qualifier inequalities added by the FiniteSat algorithm. Therefore, CD′ has only multiplicity constraints, which represent the qualifier constraint from CD. Multiplicity constraint inequalities are satisfied, as shown in [56].
Claim 3.7. If Ψ_CD is solvable, then CD′ is finitely satisfiable.
Proof. Ψ_CD contains only the inequalities described in Claim 3.6, and the proof for these kinds of inequalities has been shown in [56, 11].
These claims prove the second step of FiniteSat correctness:
Theorem 3.8 (Reduction of Finite Satisfiability of CD_{M,GS} to inequality solvability). Finite satisfiability in CD_{M,GS} is reducible to solvability of the inequality system produced by FiniteSat.
Putting together the results of the two step proof (Theorems 3.5 and 3.8), we obtain the correctness theorem for FiniteSat:
Theorem 3.9. FiniteSat correctness - Reduction of Finite Satisfiability of CD_{M,CH,GS,Q} to inequality solvability
1. Finite satisfiability in CD_{M,CH,GS,Q} is reducible to solvability of linear inequalities, for class diagrams in CD_{M,CH,GS,Q} without class hierarchy cycles that include a disjoint or a complete constraint. The reduction is given by the FiniteSat algorithm.
2. A finitely satisfiable class diagram in CD_{M,CH,GS,Q} can be effectively translated by FiniteSat into a solvable linear inequality system.
Corollary 2. If the application of FiniteSat to a class diagram CD in CD_{M,CH,GS,Q} returns an unsolvable inequality system, then CD is non-finitely satisfiable.
Claim 3.10 (FiniteSat Complexity). The construction time of the inequalities by FiniteSat, and their number, are O(n), where n is the number of constraints in the class diagram.
Proof. Every constraint contributes a constant number of inequalities.
Chapter 4
Practical Occurrence of Finite
Satisfiability in Class Diagrams
In this chapter we introduce a series of experiments based on the implementation described in appendix A. We also propose a series of class diagram metrics that are relevant to the finite satisfiability property of a class diagram. Taking these metrics into account, we programmatically generate large class diagrams and run controlled experiments to demonstrate the relevance of this research and to set up a basis for benchmarking.
4.1 Class Diagram's Metrics
Many different kinds of software metrics have been developed in recent years. These include not only the primitive LOC (lines of code) metric, which was very descriptive several decades ago but is not sufficient today, but also the object-oriented programming metrics studied by Chidamber and Kemerer in the early 1990s [26], along with recent work describing metrics for UML models directly [52, 60]. In this work we are interested in metrics that both possibly cause finite satisfiability problems and describe the complexity of a class diagram. The following metrics are the most useful in describing class diagram size and structural complexity:
1. NCM - Number of the Classes in a Model
2. NASM - Number of the Associations in a Model
3. NSUBC - Number of Subclasses of a Class
4. NSUPC - Number of Superclasses of a Class
5. DIT - Depth of Inheritance Tree
We also introduce three more metrics below. These metrics are of interest to us due to their direct impact on the finite satisfiability of a model.
1. CYCM - Number of cyclic inheritance structures in a model
2. NGS - Number of generalization set constraints in a model
3. NCM/NASM - Classes to associations ratio in a model
The last three metrics above are less conventional properties of class diagrams, but they seem very relevant when testing their effect on the finite satisfiability problem, as shown by Maraee and Balaban in [61, 62]. We contribute to this knowledge by showing the existence of finite satisfiability problems through synthetic experiments. There are also other metrics defining class diagram complexity. Some of them are derived from the UML meta model [72]. Others derive from experiments on the cognitive complexity of class diagrams in [60] and from applying mathematical techniques, such as principal component analysis. These metrics are also relevant to the structural complexity of a class diagram, but are less relevant to the finite satisfiability problem¹, and thus are omitted in our experiments.
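The size and structure metrics listed above are simple functions of a class diagram. The sketch below shows one possible way to compute them in Python over a toy in-memory representation. The `ClassDiagram` type and all function names are hypothetical and are not part of the implementation described in appendix A; NSUBC and NSUPC are assumed to count direct subclasses and superclasses, and DIT is assumed to be 0 for a root class.

```python
from dataclasses import dataclass, field


@dataclass
class ClassDiagram:
    classes: set
    associations: list                           # (name, class_a, class_b) triples
    parents: dict = field(default_factory=dict)  # subclass -> set of direct superclasses


def ncm(cd):
    """NCM: number of classes in the model."""
    return len(cd.classes)


def nasm(cd):
    """NASM: number of associations in the model."""
    return len(cd.associations)


def nsubc(cd, cls):
    """NSUBC: number of direct subclasses of cls (assumption: direct only)."""
    return sum(1 for supers in cd.parents.values() if cls in supers)


def nsupc(cd, cls):
    """NSUPC: number of direct superclasses of cls (assumption: direct only)."""
    return len(cd.parents.get(cls, ()))


def dit(cd, cls, path=()):
    """DIT: longest path from cls up to a root; defined only for acyclic hierarchies."""
    if cls in path:
        raise ValueError("cyclic inheritance structure")
    supers = cd.parents.get(cls, ())
    if not supers:
        return 0
    return 1 + max(dit(cd, s, path + (cls,)) for s in supers)


def class_to_association_ratio(cd):
    """NCM/NASM: classes to associations ratio."""
    return ncm(cd) / nasm(cd) if nasm(cd) else float("inf")


toy = ClassDiagram(
    classes={"Person", "Doctor", "Surgeon", "Patient"},
    associations=[("treats", "Doctor", "Patient")],
    parents={"Doctor": {"Person"}, "Surgeon": {"Doctor"}, "Patient": {"Person"}},
)
print(ncm(toy), nasm(toy), class_to_association_ratio(toy))  # 4 1 4.0
print(nsubc(toy, "Person"), nsupc(toy, "Surgeon"), dit(toy, "Surgeon"))  # 2 1 2
```

CYCM and NGS are omitted from the sketch, because counting them depends on how cyclic inheritance structures and generalization sets are represented.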
In the following sections we describe a series of experiments on large class diagrams. By large, we mean class diagrams with hundreds of classes and multiple constraints imposed on them, drawn from the metrics above. The class diagrams used for the experiments are automatically generated in a symbolic representation, to be tested with our implementation described in the previous chapter. Each experiment is used to examine some property, and the results are presented over class diagrams of different sizes and with multiple repetitions.
¹ These metrics deal with properties and constraints at a different level of detail, not affecting finite satisfiability directly.
4.2 Experiment 1
The problem of the finite satisfiability of models has gained more attention in recent years within the MDA research community. However, to the best of our knowledge, it has not been clearly shown that the problem does exist among models. In the following series of controlled experiments we argue for the existence of such a problem. The experiment presented in the following section does not provide a formal proof of the existence of a finite satisfiability problem; rather, it compensates for the lack of real-world benchmarks or problem sets with a best-effort approximation.
The steps taken in the experiment are:
1. Generate a class diagram with NCM classes and NASM associations
2. Check the class diagram for �nite satis�ability
3. Make k repetitions to achieve more precise results
Let's elaborate the three steps of the experiment above.
1. Class diagram generation. The best way to describe this step is to compare a class diagram to the random graph model first studied by Erdős and Rényi [33]. In this model, undirected edges are placed at random between a fixed number n of vertices to create a graph in which each of the $\frac{1}{2}n(n-1)$ possible edges is independently present with some probability p. The only exception is that we allow duplicate edges. In view of this, we take classes for vertices and associations for edges. The probability p varies together with the different values given to NCM and NASM in every generation of a class diagram. The output of this step is a file with a symbolic representation of a model in USE [37] format. The process of class diagram generation in detail:
(a) Create NCM classes.
(b) Do NASM times:
i. Choose two random classes. (This is why duplicate edges appear: we allow several different associations between two classes.)
ii. Create an association between the chosen classes.
iii. Impose random multiplicity constraints on both sides of the association created.
2. Checking for finite satisfiability. After the class diagram is generated, it is tested for finite satisfiability. This is done via our implementation of the FiniteSat algorithm [65] described in appendix A. The input is the symbolic representation of the generated class diagram, and the output is a boolean answer indicating whether a finite satisfiability problem exists.
3. Repeating over and over. The idea of the experiment is to show the existence of the finite satisfiability problem in large class diagrams. In order to show this phenomenon in a convincing way, we repeat the generation and the test multiple times. Collecting the results of these experiments enables us to show fairly precise results over a representative sample.
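The three steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the generator used in the thesis: the function names are hypothetical, the diagram is kept in memory and not written to a USE file, the two classes of an association are assumed to be distinct, and the multiplicity bounds are drawn from 1..10, the range reported for this experiment. The finite satisfiability test is passed in as a callable (`is_finitely_satisfiable`), standing in for the FiniteSat implementation of appendix A.

```python
import random


def generate_class_diagram(ncm, nasm, rng, max_mult=10):
    """Step 1: NCM classes and NASM random associations with random multiplicities."""
    classes = [f"C{i}" for i in range(ncm)]
    associations = []
    for k in range(nasm):
        # Two distinct random classes; the same pair may be drawn again,
        # which yields duplicate edges (several associations between two classes).
        a, b = rng.sample(classes, 2)
        ends = []
        for _ in range(2):
            low = rng.randint(1, max_mult)
            high = rng.randint(low, max_mult)  # keeps low <= high
            ends.append((low, high))
        associations.append((f"R{k}", a, ends[0], b, ends[1]))
    return classes, associations


def problem_rate(ncm, nasm, repetitions, is_finitely_satisfiable, seed=0):
    """Steps 2 and 3: fraction of generated diagrams that fail the supplied test."""
    rng = random.Random(seed)
    problems = 0
    for _ in range(repetitions):
        diagram = generate_class_diagram(ncm, nasm, rng)
        if not is_finitely_satisfiable(diagram):
            problems += 1
    return problems / repetitions
```

A call such as `problem_rate(50, 25, 100, check)` would then correspond to one row of the experiment, with `check` being whatever finite satisfiability test is available.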
We now present the results of the experiment described above. Table 4.1 shows the results of the experiment on the existence of finite satisfiability problems in huge class diagrams. By huge we mean hundreds of classes and associations between them. For every such experiment, 100 and 1000 repetitions were made in order to obtain high-confidence results. The first column of the table is the number of classes generated in the diagram. The second column is the number of associations. The last two columns state the fraction of the generated diagrams in which a finite satisfiability problem was present.
Table 4.1: Existence Experiment Results
NCM    NASM    FS problems in 100 repetitions    FS problems in 1000 repetitions
50     10      8%                                12.1%
50     25      34%                               35.1%
50     50      91%                               87.1%
100    50      29%                               25.9%
100    100     96%                               89.2%
200    100     31%                               27%
200    200     100%                              96.4%
500    100     12%                               6.8%
500    250     24%                               27.4%
500    500     100%                              99.5%
It is worth noting that the class diagrams generated were very constrained. Multiplicity constraints were randomly selected in the range [1..10]. There is another interesting issue to consider in the results presented in Table 4.1, specifically the NCM/NASM ratio. Is there any point in considering class diagrams where the number of associations is smaller than the number of classes by an order of magnitude? There is. It seems that in many real-world class diagrams a major part of the associations between classes are unconstrained. Put otherwise, they have a many-to-many [0..*] multiplicity constraint, which is effectively no constraint. Since a well-studied reason for finite satisfiability problems is a cycle of conflicting multiplicity constraints [61, 65, 11], such associations are not of interest to us. Notice that the associations generated in our class diagrams are all constrained. Therefore, class diagrams where NCM/NASM is any fixed number greater than one can easily describe a real-world model.
The next experiment we intend to perform on the existence problem adds more constraints and parameters to the class diagram being generated. The parameters Number of Subclasses (NSUBC), Number of Superclasses (NSUPC), Depth of Inheritance Tree (DIT), Cyclic Inheritance Structures (CYCM) and Number of Generalization Sets (NGS) will be added to the class diagrams. NSUBC and NSUPC increase the complexity of the diagram in a straightforward way, while DIT, CYCM and NGS clearly cause more finite satisfiability problems that need special algorithmic treatment, as has been shown in [24, 62, 65, 61, 64]. It might be interesting to check the occurrence of finite satisfiability problems through coverage of the problem patterns shown in [12].
Table 4.2: Scalability Experiment Results
NCM     NASM    Running time in milliseconds
50      10      7
50      25      12
50      50      37
100     50      112
100     100     376
200     100     899
200     200     2963
500     100     4134
500     250     13514
500     500     46229
1000    100     17404
1000    500     91170
4.3 Experiment 2
A very important question we deal with in this section is: is our algorithm scalable and appropriate for large models? The use of linear programming methods and a Java implementation makes this question non-trivial. In the following results we show the running time of our algorithm on large generated models. Every row in Table 4.2 shows the average running time over 100 repetitions. The experiment was performed on a machine with a dual core 1.3 GHz CPU and 2 GB RAM, running the Windows Vista operating system, with the algorithm implementation² described in appendix A. Note that these results demonstrate very high performance. Consider the last row, which says that a class diagram with 1000 classes and 500 constrained associations can be tested for finite satisfiability in 1.5 minutes. All other results show running times well under one minute. This raises an interesting question about the need for incremental techniques for finite satisfiability testing. It seems the only reason for using them is to spare the modeler from waiting a minute for the CASE tool's answer after clicking on the FiniteSat test button.
² available at http://www.cs.bgu.ac.il/ modeling/
4.4 Experiment 3
In this section we describe an experiment whose aim is to determine when the problem of finite satisfiability arises. In order to do so, we performed a set of consecutive experiments of 100 repetitions each, raising the number of constrained associations. The number of classes in the generated class diagram was constant: 50. The number of associations varied from one to 50. Figure 4.1 plots the results we obtained. We see that with 50 classes and 50 associations there is a 0.83 probability (assuming a normal distribution, which might be different in real-world applications) of having a finite satisfiability problem in a randomly generated class diagram.
Figure 4.1: Appearing FiniteSat on 50 Classes Diagram
Figure 4.2 presents the results of the analogous experiment. This time the class diagram size was enlarged. We started with 100 classes, and added up to 100 associations. As in the previous case, we ran the experiment 100 times for each case, each time generating a random class diagram from scratch. We see that the probability of encountering a finite satisfiability problem rises together with the number of associations. This is intuitively clear: the more constrained the model is, the more contradictions appear.
Figure 4.2: Appearing FiniteSat on 100 Classes Diagram
Part II
Metrics
During the experiments that explored the occurrence and relevance of finite satisfiability, described in chapter 4, several problems arose. The main problem was the lack of flexibility in changing metrics. Consequently, a method independent of the choice of metrics was developed. The method automates benchmark creation given a metric suite. The method is based on:
− Abstraction of metrics specification: a general pattern for a metrics language.
− Using a model checker for benchmark creation:
  - Using a meta model, that is, the abstract syntax of the model.
  - A metric suite for the meta model. Creating a benchmark which is a meta model instance fitting the metric suite.
In this part the following topics are discussed:
1. What is a metric?
2. What are the rules for determining the credibility of a metric suite?
3. What do model benchmarks look like?
4. How can metrics be used to generate benchmarks for modeling problems?
Chapter 5
Metrics
Metrics are needed for controlling processes, since only measurable processes can be controlled. Decision making improves when the decision process is grounded on given numbers describing what is being controlled, rather than on intuitive feelings and observations. Genero et al. [36] note that in a marketplace of highly competitive products, delivering quality software is not only an advantage, but a necessary factor for software companies to be successful. It is widely accepted in software engineering that the quality of a software system should be assured from the initial phases of its life cycle. Class diagrams in particular are an artifact often available in the early stages of software development, and thus serve as a natural subject for metric characterization of external attributes, such as coupling, complexity, etc., separately from behavior.
Definition: Metric. We define a metric µ to be a non-negative function from a specific domain of models to the set of real numbers R.
Depending on the specific need, metrics can be required to satisfy additional properties.
5.1 Weyuker's Characterization of Metrics Properties
Weyuker [89] proposed a set of properties for evaluating software metrics. The properties
were successfully used by Chidamber and Kemerer in their highly cited work [27] on object oriented
design metrics. Today, Weyuker's properties are still relevant, and different interpretations
are examined in [71]. Weyuker's original properties examine metrics of classical sequential
programs consisting of program statements. Examples of such programs are programs
written in Pascal, C or Fortran. According to Weyuker [89], one way to think of
a program is as an object made up of smaller programs. From this point of view, the basic
operation in constructing programs is composition, which concatenates two program
bodies. P;Q is the program body formed by appending the program body Q immediately
after the last statement of P. Weyuker uses |P| to denote the complexity of P with
respect to some hypothetical measure, where |P| is a non-negative number. Below is
a list of properties for evaluating metrics for such programs:
1. Property 1: (∃P)(∃Q)(|P| ≠ |Q|). This is a property of any general metric. Surely,
a metric which rates all programs equally is not really a metric.
2. Property 2: (∃P)(∃Q)(P ≡ Q, |P| ≠ |Q|).1 This property considers syntactic
complexity metrics. That is, the complexity of the program is being measured, not the
function being computed by the program.
3. Property 3: Let c be a non-negative number. Then there are only finitely many
programs of complexity c. This property is needed to strengthen the first property.
4. Property 4: (∀P)(∀Q)(|P| ≤ |P;Q|, |Q| ≤ |P;Q|). This property states that the
components of a program are no more complex than the program itself.2
5. Property 5: This property addresses the question of whether the concatenation of a given
program body with other program bodies should always affect the complexity of the
resultant program body in a uniform way.
(a) (∃P)(∃Q)(∃R)(|P| = |Q|, |P;R| ≠ |Q;R|)
(b) (∃P)(∃Q)(∃R)(|P| = |Q|, |R;P| ≠ |R;Q|)
1 P ≡ Q means that P and Q compute the same function.
2 Weyuker terms this property monotonicity.
6. Property 6: There are program bodies P and Q such that Q is formed by permuting
the order of statements of P, and |P| ≠ |Q|. This property asserts that program
complexity should be responsive to the order of the statements, and hence to the potential
interaction among statements.
7. Property 7: If P and Q are almost identical3 then |P| = |Q|. This property examines
the question of what kind of syntactic modifications should leave the complexity of a
program unchanged.
8. Property 8: (∀P)(∀Q)(|P| + |Q| ≤ |P;Q|). This property examines the question of
whether the complexity of a program body should be no less than the sum of the complexities
of its components.
The properties above are interesting not only because they can help in choosing a metric suite,
but also because they are useful for examining the strengths and weaknesses of metrics, and thus
for comparing existing metrics. Chidamber and Kemerer [27] used Weyuker's properties to evaluate
metrics for object oriented design, and later in this work the metrics proposed for models
are also evaluated against Weyuker's properties.
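As a minimal sketch of how such properties can be probed, the following Python fragment tests a few of them for a simple statement-count measure. The representation of a program body as a tuple of statements, the sample `programs`, and all function names are assumptions made for illustration only. A finite sample can only exhibit witnesses for the existential properties and fail to refute the universal ones; it proves nothing in general.

```python
from itertools import product

# Assumption: a program body is a tuple of statement strings,
# P;Q is tuple concatenation, and |P| is the number of statements.
def compose(p, q):
    return p + q

def size(p):
    return len(p)

# Hypothetical sample of program bodies.
programs = [(), ("a",), ("b",), ("a", "b"), ("b", "a"), ("a", "b", "c")]

# Property 1: some two programs receive different values.
non_coarse = any(size(p) != size(q) for p, q in product(programs, repeat=2))

# Property 4 (monotonicity): no counterexample within the sample.
monotone = all(
    size(p) <= size(compose(p, q)) and size(q) <= size(compose(p, q))
    for p, q in product(programs, repeat=2)
)

# Property 5(a): look for |P| = |Q| with |P;R| != |Q;R|.
# Statement count is additive, so the sample contains no witness.
uniform_witness = any(
    size(p) == size(q) and size(compose(p, r)) != size(compose(q, r))
    for p, q, r in product(programs, repeat=3)
)

# Property 6: permuting statements never changes a statement count.
order_sensitive = size(("a", "b")) != size(("b", "a"))
```

Under these assumptions the statement count shows a witness for Property 1 and no violation of Property 4, while no witness is found for Properties 5(a) and 6.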
5.2 Object Oriented Metrics
5.2.1 Bunge's definition of object complexity
The use of class properties as complexity measures was inspired by Bunge's definition of
ontologies [18, 19]. Like any substantial individual, a class possesses a finite number of
properties. The properties do not exist on their own, but are attached to individuals. On the
other hand, substantial individuals are not simply bundles of properties. A substantial
individual and its properties collectively constitute an object. An object can be represented
as X = <x, p(x)> where x is the substantial individual and p(x) is the finite collection
of its properties. x can be considered to be the token or the name by which the object is
represented in a system.
3 Q is a syntactical transformation of P.
Based on this representation, Bunge defines the similarity of two objects X and Y to
be σ(X,Y) = p(x) ∩ p(y), following the general principle of defining similarity in terms of sets.
The complexity of an individual is defined as the numerosity of its composition, implying that
a complex individual has a large number of properties. The complexity of <x, p(x)> is |p(x)|,
where |p(x)| is the cardinality of p(x).
5.2.2 Chidamber and Kemerer Metrics and Evaluation
In 1994 Chidamber and Kemerer (C&K) introduced a metrics suite [27] for object oriented
design. The C&K suite consists of six metrics that measure complexity of a single class.
For a class C, C&K define p(c) = {M_C} ∪ {I_C} where {M_C} is the set of methods
and {I_C} is the set of instance variables (a.k.a. data members) of the class C. Following
Bunge's definition above, a binary operation + on two classes is defined. For two classes
X = <x, p(x)> and Y = <y, p(y)>, X + Y is defined as <z, p(z)> where z is the token
with which X + Y is represented and p(z) is given by p(z) = p(x) ∪ p(y).
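The set-based definitions above can be sketched in a few lines of Python. The property names in `p_x` and `p_y` are invented for illustration, and property sets are modeled simply as Python sets of strings.

```python
# Assumption: the property set p(x) of an object is a set of property names.
def similarity(p_x, p_y):
    """Bunge's similarity: the properties two objects have in common."""
    return p_x & p_y

def complexity(p_x):
    """Complexity as the cardinality of the property set."""
    return len(p_x)

def combine(p_x, p_y):
    """Property set of X + Y: the union of both property sets."""
    return p_x | p_y

# Hypothetical property sets (methods and instance variables) of two classes.
p_x = {"deposit", "withdraw", "balance"}
p_y = {"withdraw", "balance", "owner"}

common = similarity(p_x, p_y)   # shared properties of the two classes
p_z = combine(p_x, p_y)         # properties of the combined class
```

Because the combination is a union, shared properties are counted once, so the complexity of X + Y is at most the sum of the two complexities.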
The metrics evaluate a single class in a way that measures different aspects of object
oriented design. The metrics are theoretically grounded in the following terms4:
1. Complexity. Numerosity of composition. The cardinality of the properties, that is, the
cardinality of the different sets of methods and instance variables.
2. Scope of properties. Reflects design decisions (DIT, NOC): how classes are arranged
in a hierarchy, and how their methods and instance variables affect the system. How far
does the influence of a property extend?
3. Coupling and Cohesion. Two terms that are used to characterize OO design.
(a) Coupling. Two objects are coupled if and only if at least one of them acts upon
the other.
(b) Cohesion. Following the set theoretic definition of similarity, cohesion can be
defined as the similarity between two methods, where the similarity set is the set of
instance variables common to the two methods.
4 When closely examined, they are reminiscent of Fowler's bad smells [35], especially the Divergent Change bad smell.
The metrics are evaluated against six of Weyuker's evaluation properties. The metrics
C&K propose are:
1. Weighted Methods Per Class (WMC). Consider a class C1 with methods M1, ..., Mn
that are defined in the class. Let c1, ..., cn be the complexities of the methods5. Then
WMC = ∑_{i=1}^{n} c_i. If all method complexities are considered to be unity, then WMC =
n, the number of methods; for example, WMC(A) = 3 in figure 5.1.
Figure 5.1: A class diagram
5 Complexity is deliberately not defined more specifically here in order to allow the most general application of this metric.
2. Depth of Inheritance Tree (DIT). The depth of inheritance of the class is the DIT metric
for the class. In cases involving multiple inheritance, the DIT is the maximum
length from the node to the root of the tree. For example, DIT(B) = 1 and DIT(A) =
DIT(C) = 0 in figure 5.1.
3. Number Of Children (NOC). The number of immediate subclasses subordinated to a
class in a class hierarchy. For example, NOC(A) = 1 in figure 5.1.
4. Coupling Between Object Classes (CBO). CBO for a class is a count of the number of
other classes to which it is coupled. Two classes are coupled when methods declared
in one class use methods or instance variables defined by the other class. For example,
CBO(C) = 1 in figure 5.1, since class C is coupled only to class A.
5. Response For a Class (RFC). RFC = |RS| where RS is the response set for the
class. The response set of a class is the set of methods that can potentially be executed
in response to a message received by an object of that class. It should be noted
that membership in the response set is defined only up to the first level of nesting of
method calls. The set also specifically includes methods called outside of the class.
For example, in figure 5.1, if the methods Foo, Goo and Boo are recursive methods
that call only themselves, we get RFC(A) = 3. To obtain an accurate RFC value,
code or sequence diagrams must be analyzed.
6. Lack of Cohesion in Methods (LCOM). The LCOM is a count of the number of
method pairs whose similarity is zero minus the count of method pairs whose similarity
is not zero. As with the RFC metric, the code of the class must be analyzed in order to
compute LCOM. Assume that the Foo, Goo and Boo methods of class A in figure 5.1
do not use common data members; then we get LCOM(A) = 3.
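The following Python sketch computes four of these metrics (WMC, DIT, NOC and LCOM as described above) on a toy hierarchy. Figure 5.1 is not reproduced here, so the dictionaries below are a hypothetical reconstruction chosen only to agree with the values quoted in the text; the variable names `x`, `y`, `z` and all function names are assumptions.

```python
from itertools import combinations

# Hypothetical data consistent with the values quoted for figure 5.1:
# A declares Foo, Goo and Boo; B is a subclass of A; C has no superclass.
parents = {"A": [], "B": ["A"], "C": []}
methods = {"A": ["Foo", "Goo", "Boo"], "B": [], "C": []}
# Instance variables used by each method (assumed pairwise disjoint).
uses = {"Foo": {"x"}, "Goo": {"y"}, "Boo": {"z"}}

def wmc(cls, cost=lambda method: 1):
    """Weighted methods per class; unit cost gives the method count."""
    return sum(cost(m) for m in methods[cls])

def dit(cls):
    """Maximum path length from the class to a root of the hierarchy."""
    if not parents[cls]:
        return 0
    return 1 + max(dit(p) for p in parents[cls])

def noc(cls):
    """Number of immediate subclasses."""
    return sum(1 for supers in parents.values() if cls in supers)

def lcom(cls):
    """Pairs of methods sharing no variable minus pairs sharing some."""
    pairs = list(combinations(methods[cls], 2))
    disjoint = sum(1 for a, b in pairs if not (uses[a] & uses[b]))
    return disjoint - (len(pairs) - disjoint)
```

With this data the functions give WMC(A) = 3, DIT(B) = 1, NOC(A) = 1 and LCOM(A) = 3, matching the examples in the list. CBO and RFC are left out because they require usage and call information that the text says must come from code or sequence diagrams.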
C&K chose six properties out of Weyuker's properties list to evaluate the metrics they
propose. The properties chosen were properties number 1, 3, 4, 5, 6 and 9.6 The properties were
changed7 to be appropriate to classes as the objects being measured. The remaining three properties,
which C&K did not use, are not suited to classes, but only to sequential programs.
In order to evaluate their metrics, C&K defined what a class is and what a binary operation +
on two classes means. Thus, they defined a combination of two classes, based on the definition
of class properties. The use of class properties as complexity measures was inspired by
Bunge's definition of ontologies.
6 They numbered Weyuker's original properties 8.a and 8.b as different properties.
5.3 Metrics for Models
5.3.1 Background
For many specific goals, many different metrics for models have been proposed over the last
decade. A clear goal is specified when a metric is proposed. Many authors have proposed metrics
for class diagrams in particular; a survey of such metrics was reported by Genero et al. in
[36]. In general, metrics can be classified into several categories. For example, there can
be size or structure metrics [59], which can be used to measure size and internal constraints
in a class diagram. Metrics defined by different authors [36] are used for the following goals:
1. Measuring design complexity in relation to its impact on external quality attributes
such as maintainability, reusability, etc. For example, 3 of the 6 metrics Chidamber and
Kemerer propose [27]:
− WMC
− NOC
− DIT
2. Measuring different internal properties such as coupling. For example, Li and Henry
[57] propose the DAC metric, which is the number of attributes in a class that have
another class as their type.
3. Measuring object oriented mechanisms such as inheritance or information hiding. Abreu
and Melo [1] propose the method hiding factor and the attribute inheritance factor, as
part of the MOOD metric suite, as such measures.
4. Measuring class diagram complexity. See the work by Manso et al. [60] for a description of
cognitive experiments identifying which metrics contribute to class diagram complexity.
The metrics above are used to measure existing class diagrams at various stages of their
development. Many research results demonstrate automatic metric-extraction tools.
7 Transformed to classes, rather than sequential program bodies.
5.3.2 Weyuker's Properties Adaptation for Models Metrics
Class Diagram is a visual language that consists of classes, associations and features. These
elements are further restricted by constraints. Following Bunge's definition of an object's
complexity, and inspired by the example of Chidamber and Kemerer [27], we define a class diagram
CD = {C_CD} ∪ {P_CD} ∪ {CON_CD} where {C_CD} is the set of classes, {P_CD} is the set
of properties (this set includes the associations in a class diagram) and {CON_CD} is the set of
constraints imposed on the class diagram elements.
Consistent Renaming: We define a consistent renaming of a class diagram to be a
renaming of elements that takes effect in every place where the renamed element of the class
diagram is mentioned. For example, in figure 5.1 a consistent renaming could be the renaming of
class A to X, making the attribute att of class C change its type from A to X accordingly.
Ordering: Ordering of a class diagram means rearranging the elements of the class diagram,
that is, imposing constraints on different classes or properties, or imposing an association
on classes other than those it was originally imposed on. No elements are added or removed in
the ordering. Intuitively, the elements only change their place in the diagram.
Binary operation + : Combination of class diagrams: Bunge provides an ontology
as a basis for defining the combination of class diagrams. The combination of two (or more) class
diagrams results in another class diagram whose elements are the union of the elements of
the component class diagrams.
Let CD1 = {C_CD1} ∪ {P_CD1} ∪ {CON_CD1} and CD2 = {C_CD2} ∪ {P_CD2} ∪ {CON_CD2} be
two class diagrams. Then CD1 + CD2 is defined as:
CD3 = {C_CD1} ∪ {C_CD2} ∪ {P_CD1} ∪ {P_CD2} ∪ {CON_CD1} ∪ {CON_CD2}
The combination is more than just a union of the property sets. The combination is
recursive. That is, for example, when two classes have the same name8 they are merged too9.
Semantic equivalence: Two class diagrams are semantically equivalent if they have exactly
the same instances. Intuitively, two distinct class diagrams can be semantically equivalent
but different10, due to transitive class hierarchy constraints, which are specified visually.
Although many researchers propose different ways of combining software models
in general, and UML class diagrams in particular [30, 68, 76], the above definition of
combination is chosen for metrics evaluation, since it relies neither on user participation
nor on versioning systems.
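A minimal Python sketch of this combination, assuming that classes, properties and constraints are identified by plain strings, so that identically named elements coincide under set union. The `ClassDiagram` type and the sample element names are hypothetical, and conflict resolution between overlapping diagrams is not modeled.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ClassDiagram:
    classes: frozenset      # class names
    properties: frozenset   # attributes and associations
    constraints: frozenset  # constraints on the elements above

    def __add__(self, other):
        # CD1 + CD2: component-wise union; elements with the same
        # name are merged because sets hold each name only once.
        return ClassDiagram(
            self.classes | other.classes,
            self.properties | other.properties,
            self.constraints | other.constraints,
        )

# Two hypothetical overlapping diagrams sharing class B.
cd1 = ClassDiagram(
    frozenset({"A", "B"}),
    frozenset({"A.att", "r:A-B"}),
    frozenset({"r.b:1..2"}),
)
cd2 = ClassDiagram(frozenset({"B", "C"}), frozenset({"s:B-C"}), frozenset())

cd3 = cd1 + cd2
```

Here `cd3` contains three classes rather than four, since the shared class B is merged.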
Merging Invariants: In case the two class diagrams being merged are disjoint,
we envision no problems with the merging operation. This is not the case when the merging is
applied to two overlapping class diagrams. During the combination of overlapping class diagrams,
different conflicts may arise (different multiplicity constraints on the same association, etc.),
depending on the overlapping part.
General syntactic conflicts. If such conflicts arise, we assume that they are solved by a
pre-defined strategy, in a way that does not remove elements of the resulting
class diagram. Rather, it reorganizes the elements of the diagram, without affecting the sizes
of the class diagram's property sets. We introduce several invariants that must hold during the
class diagram combination process.
1. Example 1: Multiplicity Constraint Conflict. In case there is a non trivial multiplicity
constraint on the same association end in both class diagrams, which gives rise to a conflict
situation, the non trivial multiplicity constraint remains in the resulting diagram,
regardless of the way the conflict was solved. Consider figure 5.2, figure 5.3 and figure 5.4,
where CD3 = CD1 + CD2, as an example.
Figure 5.2: Class Diagram: CD1
Figure 5.3: Class Diagram: CD2
The multiplicity constraint on the association end stays non trivial; that is, the two
conflicting constraints in CD1 and CD2 are resolved by creating a new non trivial
constraint, rather than by setting it to any and thus unconstraining the association end.
Figure 5.4: Class Diagram: CD3
2. Example 2: Class Hierarchy Constraint Conflict. In case there are two classes A and
B such that A is a subclass of B in CD1 and B is a subclass of A in CD2, as
shown in figure 5.5 and figure 5.6, then, without loss of generality, if CD3 = CD1 + CD2, A
will remain a subclass of B in CD3. Class B, however, will be subclassing some other
class, say C, in CD3, as shown in figure 5.7. Which class exactly is chosen to
be the class C is decided by a predefined strategy. To argue that such a class C can
always be found, consider the Object class in modern object oriented languages such
as Java or C♯, where each and every class is a subclass of Object.
Figure 5.5: CD1
Figure 5.6: CD2
Figure 5.7: Class Diagram: CD3
8 Two classes having the same name in a merging operation represent a conflict that must be solved with a well defined strategy.
9 As Chidamber and Kemerer define in [27].
10 Syntactically different.
Following are Weyuker's properties and their intuitive meaning in terms of models, rather
than sequential programs:
1. Property 1 - Non-Coarseness. A metric µ satisfies non-coarseness if there exist
two distinct models M1 ≠ M2 such that µ(M1) ≠ µ(M2). This property characterizes
syntactic sensitivity. This implies that not every model can have the same value for a
metric; otherwise the metric has no value as a measurement. For example, NCM/NCM, where NCM
is the number of classes in a model, is not a useful metric, since it is always equal to 1
for every class diagram model.
2. Property 2 - Finiteness. A metric µ satisfies finiteness if there is a finite number
of models M with the same value for the metric µ.
3. Property 3 - Non-Uniqueness. A metric µ satisfies non-uniqueness if there exist
two distinct models M1 ≠ M2 such that µ(M1) = µ(M2). That means that a metric
measures semantic properties: two different models can have the same metric value.
For example, it is obvious that two different class diagrams can have different numbers
of associations or constraints, thus being syntactically different, and still have the
same number of classes, that is, equal NCM values.
4. Property 4 - Syntactic Sensitivity. A metric µ satisfies syntactic sensitivity if
there exist two different (M1 ≠ M2) but semantically equivalent (M1 ≡ M2) models, which
describe equal systems, such that µ(M1) ≠ µ(M2). Such a metric distinguishes syntactic
differences that have no semantic effect.
5. Property 5 - Monotonicity. A metric µ satisfies monotonicity if for every two
models M1 and M2, µ(M1) ≤ µ(M1 + M2) and µ(M2) ≤ µ(M1 + M2), where M1 + M2
denotes the combination of M1 and M2. Combination of models cannot decrease the metric
value.
6. Property 6 - Combination Sensitivity. A metric µ satisfies combination sensitivity
if there exist M1, M2 and M3 such that µ(M1) = µ(M2) but µ(M1 + M3) ≠ µ(M2 + M3).
This property measures sensitivity to model combination: the interaction between M1
and M3 can be different from the interaction between M2 and M3, resulting in different
metric values for M1 + M3 and M2 + M3. For example, see the analytical evaluation
of the NCM metric later in this chapter.
7. Property 7 - Ordering Sensitivity. A metric µ satisfies ordering sensitivity if there
exist two models M1, M2 such that M2 is obtained by ordering11 of M1's elements,
and µ(M1) ≠ µ(M2). The metric should be sensitive to permutation of the inner elements
of a model.
8. Property 8 - Consistent Renaming. A metric µ satisfies consistent renaming
if for every two models M1, M2 such that M2 is obtained by a consistent renaming of
M1's elements, µ(M1) = µ(M2). The metric should be indifferent to consistent
renaming of elements inside a model. That is, the metric's value should not change with
a consistent renaming of an element inside a model.
9. Property 9 - Interaction Increases Complexity. A metric µ satisfies the interaction
increases complexity property if there exist two models M1 and M2 such that µ(M1) +
µ(M2) < µ(M1 + M2). The intuitive meaning is that the metric should reflect the combination
of models. The principle behind this property is that when two models are combined,
the interaction between the models can increase the complexity metric value.
11 The notion of model ordering must be defined in order to evaluate metrics with respect to this property.
What are metric properties for? According to Weyuker [89], the properties of syntactic
complexity serve as a basis for metrics evaluation, and should help to clarify the strengths
and weaknesses of metrics. The properties should allow us to formally compare complexity
models.
5.3.3 Metrics Evaluation
In this section we evaluate some of the metrics proposed for UML class diagrams [52, 60, 36]
and those that were used for synthetic generation in [59].
− Metric 1: Number of Classes in a Model (NCM).
NCM is defined to be the number of classes in a class diagram [52], that is, given CD
as defined earlier, NCM = |{C_CD}|.
Theoretical basis: NCM relates directly to Bunge's definition of the complexity of a thing,
since classes are properties of a class diagram and complexity is determined by the
cardinality of its set of properties.
Analytical evaluation of NCM
1. Obviously there exist two distinct class diagrams CD1 and CD2 such that µ(CD1) ≠
µ(CD2), therefore property 1 is satisfied.
2. Property 2 is not satisfied: another association can always be added to any given
class diagram. This means there is an infinite number of class diagrams having the
same NCM value.
3. There exist two distinct class diagrams CD1 and CD2 such that µ(CD1) =
µ(CD2), therefore property 3 is satisfied.
4. The same application domain can be modeled by two designers in two different
ways, using different considerations about generalization for instance, creating
different classes in a class diagram. Therefore, property 4 is satisfied.
5. µ(CD1 + CD2) = µ(CD1) + µ(CD2) − σ, where σ is the number of common classes
between CD1 and CD2. Clearly, the maximum value of σ is min(µ(CD1), µ(CD2)).
It follows that µ(CD1) ≤ µ(CD1 + CD2) and µ(CD2) ≤ µ(CD1 + CD2), thereby
satisfying property 5.
6. Let CD1 and CD2 be two class diagrams such that µ(CD1) = µ(CD2) and
{C_CD1} ∩ {C_CD2} = ∅. Let CD3 be a class diagram such that {C_CD1} ∩ {C_CD3} ≠ ∅
and {C_CD2} ∩ {C_CD3} = ∅. From the definition of
class diagram merging above, it follows that µ(CD1 + CD3) ≠ µ(CD2 + CD3).
Therefore property 6 is satisfied.
7. Property 7 is not satisfied. Given any class diagram CD, no matter how the
class elements are ordered, their number remains fixed. Put otherwise,
moving an association, feature or constraint from one class to another does not
change the number of classes in a diagram.
8. If we rename12 the classes inside a diagram, this trivially does not change the
number of classes. Therefore property 8 is satisfied.
9. From the analysis of property 5 above, we get that property 9 does not hold: since
σ ≥ 0, µ(CD1 + CD2) never exceeds µ(CD1) + µ(CD2). Roughly speaking, the number of
classes cannot grow beyond the sum due to merging of two class diagrams.
12 We assume that every class appears exactly once in a given class diagram.
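The NCM analysis can be replayed on a small example. In this Python sketch a class diagram is reduced to its set of class names, which is all NCM looks at; the diagrams `cd1`, `cd2`, `cd3` and their class names are invented for illustration.

```python
from itertools import product

# Assumption: for NCM only the set of class names matters.
def combine(cd_a, cd_b):
    return cd_a | cd_b

def ncm(cd):
    return len(cd)

cd1 = {"Person", "Car"}
cd2 = {"Garage", "Wheel"}       # same NCM as cd1, disjoint from cd1 and cd3
cd3 = {"Person", "Address"}     # shares the class Person with cd1

# Property 5: NCM(CD1 + CD3) = NCM(CD1) + NCM(CD3) - sigma.
sigma = len(cd1 & cd3)
merged_13 = ncm(combine(cd1, cd3))
merged_23 = ncm(combine(cd2, cd3))

# Property 9 would need NCM(X) + NCM(Y) < NCM(X + Y) for some pair;
# no pair in this sample provides such a witness.
sample = [cd1, cd2, cd3]
interaction_witness = any(
    ncm(a) + ncm(b) < ncm(combine(a, b)) for a, b in product(sample, repeat=2)
)
```

In this sample `cd1` and `cd2` have equal NCM, yet combining each with `cd3` gives different values, which is the kind of witness property 6 asks for.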
− Metric 2: Number of non-Trivial Multiplicity constraints (NoTM).
NoTM is defined to be the number of non trivial (different from 0 and *) multiplicity
constraints at the ends of binary associations [59].
Theoretical basis: NoTM relates directly to Bunge's definition of the complexity of a
thing, since constraints in general, and multiplicity constraints in particular, are
properties of a class diagram, while complexity is determined by the cardinality of its set of
properties. It was shown [59] that the NoTM value has a major impact on the correctness of
class diagrams.
Analytical evaluation of NoTM
1. Obviously there exist two distinct class diagrams CD1 and CD2 such that µ(CD1) ≠
µ(CD2), therefore property 1 is satisfied.
2. Property 2 is not satisfied: another unconstrained association can always be
added to any given class diagram. This means there is an infinite number of class
diagrams having the same NoTM value.
3. There exist two distinct class diagrams CD1 and CD2 such that µ(CD1) =
µ(CD2), therefore property 3 is satisfied.
4. Two equal systems can be modeled in two different ways by different designers,
using different considerations about generalization for instance, creating different
classes in a class diagram and imposing different numbers of constraints on
associations among these classes. Therefore, property 4 is satisfied.
5. µ(CD1 + CD2) = µ(CD1) + µ(CD2) − σ, where σ is the number of common
multiplicity constraints between CD1 and CD2. Clearly, the maximum value
of σ is min(µ(CD1), µ(CD2)). It follows that µ(CD1) ≤ µ(CD1 + CD2) and
µ(CD2) ≤ µ(CD1 + CD2), thereby satisfying property 5. We assume that when a
conflict between multiplicity constraints on the same end of a binary
association occurs, it is solved, but a constraint stays. See [30] for an example of
solving such a conflict.
6. Let CD1 and CD2 be two class diagrams such that µ(CD1) = µ(CD2) and
{CON_CD1} ∩ {CON_CD2} = ∅. Let CD3 be a class diagram such that {CON_CD1} ∩ {CON_CD3} ≠ ∅
and {CON_CD2} ∩ {CON_CD3} = ∅. From
the definition of class diagram merging above, it follows that µ(CD1 + CD3) ≠
µ(CD2 + CD3). Therefore property 6 is satisfied.
7. Property 7 is not satisfied. Given any class diagram CD, no matter how
its elements are ordered, the NoTM value remains the same. Put otherwise,
moving an association, feature or constraint from one class to another does not
change the number of multiplicity constraints in a diagram.
8. Property 8 is obviously satisfied.
9. From the analysis of property 5 above, we get that property 9 does not hold.
Roughly speaking, the number of multiplicity constraints cannot grow beyond the sum due to
merging of two class diagrams.
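The invariant used in the analysis of property 5, that a conflict on an association end is resolved but a constraint stays, can be sketched as follows. The dictionary representation, the `keep_first` strategy and the sample association ends are all assumptions made for illustration; the text only requires that some pre-defined strategy leaves one non trivial constraint on the end.

```python
# Assumption: a diagram's non trivial multiplicity constraints are given
# as a mapping from an association end to a (min, max) pair.
def keep_first(bound_a, bound_b):
    """Hypothetical conflict strategy: keep the first diagram's bound."""
    return bound_a

def merge_multiplicities(m1, m2, resolve=keep_first):
    merged = dict(m1)
    for end, bound in m2.items():
        # A conflicting end is resolved, never dropped.
        merged[end] = resolve(merged[end], bound) if end in merged else bound
    return merged

def notm(mults):
    return len(mults)

cd1 = {("Owns", "car"): (1, 4)}
cd2 = {("Owns", "car"): (2, 3), ("Owns", "person"): (1, 1)}

merged = merge_multiplicities(cd1, cd2)
sigma = len(cd1.keys() & cd2.keys())   # constrained ends present in both
```

Whatever strategy is plugged in as `resolve`, each constrained end contributes exactly one constraint to the result, so the count follows the same inclusion-exclusion form as for NCM.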
− Metric 3: Number of cycles formed by constrained associations and classes
(NCYC).
Theoretical Basis: Cycles formed by constrained associations and classes cause finite
satisfiability problems in a class diagram when a conflict between multiplicity constraints
is present [61, 9].
Analytical evaluation of NCYC
1. Obviously there exist two distinct class diagrams CD1 and CD2 such that µ(CD1) ≠
µ(CD2), therefore property 1 is satisfied.
2. Property 2 is not satisfied. Given a class diagram CD, another class diagram
CD′ can always be found by adding a single dummy class to CD, such that
µ(CD) = µ(CD′).
3. There exist two distinct class diagrams CD1 and CD2 such that µ(CD1) =
µ(CD2), therefore property 3 is satisfied.
4. Two equal systems can be modeled in two different ways by different designers,
using different considerations about generalization for instance, creating different
classes in a class diagram and imposing different numbers of constraints on
associations among these classes. Therefore, property 4 is satisfied.
5. µ(CD1 + CD2) ≥ µ(CD1) + µ(CD2) − σ, where σ is the number of common cycles
between CD1 and CD2; the combination keeps every cycle of both diagrams and may also
form new ones. Clearly, the maximum value of σ is min(µ(CD1), µ(CD2)).
It follows that µ(CD1) ≤ µ(CD1 + CD2) and µ(CD2) ≤ µ(CD1 + CD2), thereby
satisfying property 5.
6. Property 6 is satisfied. Proof by example. Consider the class diagrams in figures
5.8, 5.9 and 5.10.
Figure 5.8: Class Diagram: CD1
Figure 5.9: Class Diagram: CD2
Analyzing the NCYC metric, we get that µ(CD1) = µ(CD2) = 1. However,
µ(CD1 + CD3) = 4 and µ(CD2 + CD3) = 2, see figure 5.11.
Figure 5.10: Class Diagram: CD3
Figure 5.11: Merging of Class Diagram CD1 and CD3
7. Property 7 is satisfied. The ordering of classes and of the associations among them affects
the number of cycles.
8. Property 8 is obviously satisfied. Consistent renaming of classes does not change
the number of cycles inside a class diagram. However, changing the names inside
one diagram can change the metric value.
9. Property 9 is satisfied. Proof by example. Consider the example from the analysis of
property 6.
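Since figures 5.8 to 5.11 are not reproduced here, the following Python sketch illustrates the same effect on invented diagrams. It assumes that constrained associations form a simple undirected graph over class names (no parallel associations) and counts simple cycles of length at least three; the thesis's exact notion of cycle may differ, and all names are hypothetical.

```python
def count_cycles(edges):
    """Count simple cycles (length >= 3) in an undirected simple graph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    def extend(start, path, visited):
        total = 0
        for nxt in adj[path[-1]]:
            if nxt == start and len(path) >= 3 and path[1] < path[-1]:
                total += 1   # closed a cycle; count one direction only
            elif nxt not in visited and nxt > start:
                # the smallest vertex of a cycle is always its start
                total += extend(start, path + [nxt], visited | {nxt})
        return total

    return sum(extend(s, [s], {s}) for s in adj)

# Hypothetical diagrams, each given as a set of constrained associations.
cd1 = {("A", "B"), ("B", "C"), ("A", "C")}
cd2 = {("X", "Y"), ("Y", "Z"), ("X", "Z")}
cd3 = {("B", "C"), ("C", "D"), ("B", "D")}   # shares association B-C with cd1

n1, n2, n3 = count_cycles(cd1), count_cycles(cd2), count_cycles(cd3)
n13 = count_cycles(cd1 | cd3)
n23 = count_cycles(cd2 | cd3)
```

In this sample `cd1` and `cd2` each contain one cycle, but `cd1 | cd3` contains three (the two triangles and the cycle A-B-D-C) while `cd2 | cd3` contains two. This gives a witness for property 6 and, because three exceeds the sum of the parts, for property 9 as well.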
It seems that in order for a metric suite to satisfy all of Weyuker's properties, the metrics should
reflect both the size and the structure of the class diagram. Consider test coverage of software systems.
Many characteristics, such as intensive unit testing, are appropriate only for big and relatively
complex systems, while they are not useful for a simple sequential routine without any if statement in it.
5.3.4 Related Work on Model Metrics
Kim and Boldyreff [52] proposed a list of software metrics that can be applied to UML
models. The metrics proposed in their work are based on the metamodel scheme snapshot given in
[52]. Among the metrics proposed for a model are:
− NCM - Number of classes in a model
− NIM - Number of inheritance relations in the model
− NPM - Number of packages in a model
− NASM - Number of associations in a model
Metrics proposed for class include:
− NASC - Number of associations linked to a class
− NSUPC - Number of superclasses for a class
They also introduce several metrics at the use case level, but they are omitted in this work
for simplicity and clarity.
Mens and Lanza [69] suggest expressing and defining metrics using a language independent
metamodel based on graphs. In their work, a type graph is used to specify the object
oriented meta-model. A small core of three generic metrics is proposed:
1. NodeCount (NC)
2. EdgeCount (EC)
3. PathLength (PL)
Mens and Lanza combine these generic metrics with an object oriented metamodel to express
typical object oriented metrics in terms of the generic ones. For example, considering the
metamodel in figure 5.12 we get:
1. NC(s, system, class) = number of classes in the system s
2. EC(c, class, inheritance, single) = number of children of class c
3. PL(c, class, inheritance, maximal) = depth of class c in the inheritance tree
Figure 5.12: An object oriented metamodel
The main advantages of their approach are that the metrics can be very easily automated, as
demonstrated in [69], and combined with each other to obtain higher order metrics including ratio
and summation.
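The idea of generic graph metrics can be illustrated with a small Python sketch. The typed graph below, the edge type names `contains` and `inheritance`, and the simplified function signatures are assumptions for illustration; they do not reproduce the formalism or the parameters of Mens and Lanza [69].

```python
# Assumption: a typed graph given as node -> type plus typed, directed edges.
nodes = {"s": "system", "A": "class", "B": "class", "C": "class"}
edges = [
    ("s", "contains", "A"), ("s", "contains", "B"), ("s", "contains", "C"),
    ("B", "inheritance", "A"),   # B inherits from A
    ("C", "inheritance", "B"),   # C inherits from B
]

def node_count(container, node_type):
    """Nodes of a given type contained in a container node."""
    return sum(
        1 for src, kind, dst in edges
        if src == container and kind == "contains" and nodes[dst] == node_type
    )

def edge_count(node, edge_type):
    """Incoming edges of a given type, e.g. the children of a class."""
    return sum(1 for _, kind, dst in edges if dst == node and kind == edge_type)

def path_length(node, edge_type):
    """Longest path of edges of a given type leaving the node."""
    targets = [dst for src, kind, dst in edges
               if src == node and kind == edge_type]
    if not targets:
        return 0
    return 1 + max(path_length(t, edge_type) for t in targets)

classes_in_s = node_count("s", "class")
children_of_a = edge_count("A", "inheritance")
depth_of_c = path_length("C", "inheritance")
```

Read this way, the three generic functions yield the number of classes, NOC and DIT respectively, and ratios or sums of their results give higher order metrics.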
Baroni et al. [14, 13] proposed a formal definition of object oriented metrics, using OCL
and the UML metamodel. Their approach involves modifying the metamodel by adding
the metrics as additional operations in the metamodel and expressing them as OCL
conditions. FLAME [2] is a library of metric definitions formulated as
OCL expressions over the UML 1.3 metamodel.
McQuillan extends the work of Baroni by decoupling the metrics definition from the UML
metamodel, and generalizes the approach to any metamodel and any set of metrics [66].
As an example, a formal definition of the CK metrics using OCL over the UML 2.0 metamodel is
developed (see figure 5.13) and demonstrated in a prototype tool called dMML (Defining
Metrics at the Meta Level).
For example, consider the NOC metric as defined by McQuillan et al. (see figure 5.14).
Figure 5.13: Extension to the UML 2.0 metamodel. This UML package diagram shows the definition of the CK metrics as a separate package, with a dependency on classes from the UML metamodel.
The full list of metrics of McQuillan's approach can be found in [66]. While this
approach is similar to that of Baroni et al., it differs in a number of key areas. The approach
of McQuillan et al. can be generalized at the metamodel level, for example, to apply to other UML
diagrams. Their metric calculation procedure is highly extensible, allowing for different
versions to be implemented and compared.
Figure 5.14: NOC Metric Definition. This OCL code defines the NOC metric from the CK metrics suite, and is part of a larger definition of the whole CK metric suite which we have implemented using dMML.
Chapter 6
Benchmarking
6.1 Metric-Driven Benchmark Creation
6.1.1 Metrics as means for algorithm evaluation
A benchmark is the act of running a computer program, a set of programs, or other
operations, in order to assess the relative performance of an object, normally by running a
number of standard tests and trials against it. Although benchmarking used to be associated
with assessing performance characteristics of computer hardware, for example the floating
point operation performance of a CPU, there are circumstances when the technique
is also applicable to software. Software benchmarks are run, for example, against compilers
or, most commonly, against database management systems. Another type of test program,
namely test suites or validation suites, is intended to assess the correctness of software.
Benchmarks provide a method of comparing the performance of various subsystems across
different chip/system architectures. Metrics have properties which can characterize an
algorithm's quantitative or qualitative performance, such as run-time or problem scope. There
is a constant call from the research community for more rigor in experimentation and
empirical validation of research results [87, 80]. In this work we advocate using metrics as
a primary means for benchmark creation.
But what does metrics selection depend on? Two obvious candidates for affecting the choice of
metrics are the problem and the algorithm solving the problem. We do not see either of these two
candidates as the exclusive reason for selecting the right metric for benchmarking. Consider
the following examples:
1. Example: Graph traversal problem. In this example we consider the problem of
traversing a connected graph G = (V,E). If we want to compare different techniques
like BFS or DFS (see [38] for details), the metrics that would be useful for comparing
running time would probably be:
− Number of vertices in a graph.
− Number of edges in a graph.
− Average degree of a vertex in a graph.
These three metrics characterize a graph by size, in the first two metrics, and by structure,
in the last metric. Running traversal algorithms like DFS and BFS on graphs with
different metric values will give us a clue for comparing their performance. However, if
we consider the average edge weight metric, it probably has nothing to do with
comparing graph traversal algorithms, since they are not influenced by edge weights
when the traversal is run. Algorithms solving the shortest path and maximal flow problems,
in contrast, do depend on this metric, as a weight and a capacity respectively.
2. FiniteSat Algorithm. A benchmark for FiniteSat would probably use the
following metrics as its main focus:
− Comparing scalability of algorithms:
  ◦ Number of classes in a class diagram.
  ◦ Number of non-trivial multiplicity constraints in a class diagram.
− Comparing structurally complex problem instances:
  ◦ Number of cycles in a class diagram.
  ◦ Ratio between the number of classes and the number of associations.
  ◦ Different ratio metrics describing the structure of the class diagram.
The relevance of finite satisfiability in a class diagram clearly depends on these metrics.
However, the metric Number of trivial multiplicity constraints has no impact on the
finite satisfiability of a class diagram.
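The three graph metrics from the traversal example can be computed directly from a vertex set and an edge set. The following is a minimal sketch, assuming an undirected graph given as plain Python sets; the function name graph_metrics and the sample path graph are hypothetical and are not part of the thesis tooling.

```python
def graph_metrics(vertices, edges):
    """Return the size and structure metrics of an undirected graph."""
    n = len(vertices)
    m = len(edges)
    # Every undirected edge adds one to the degree of each of its two endpoints.
    average_degree = (2 * m / n) if n else 0.0
    return {"vertices": n, "edges": m, "average_degree": average_degree}


# Hypothetical sample: the path graph a-b-c-d.
metrics = graph_metrics({"a", "b", "c", "d"}, {("a", "b"), ("b", "c"), ("c", "d")})
print(metrics)
```

Edge weights do not appear anywhere in this computation, which mirrors the observation that the average edge weight is irrelevant when benchmarking plain traversal.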
6.1.2 Brute-Force Benchmark Creation Without Abstraction
The relevance of finite satisfiability was studied earlier in this work, in chapter 4. The
problems created for the experiments were class diagrams. The class diagrams were created by
repeatedly running a sequential program, consisting mainly of for statements, which generated
a text file conforming to the format of USE [37]; the file was then checked for finite
satisfiability with the software presented in appendix A. The generation code was very
inflexible, since it depended on the concrete metrics used, and adding each new metric to the
generation process required a lot of work. An additional metric required not only code for
generating the elements measured by the metric itself, but also hard-coded adaptations in the
code of all other metrics, ensuring that there was no collision in either metrics or syntax.
The process can be compared to parsing a text without grammar tools, by directly developing
and coding the parser program.
6.2 Benchmark Creation via Model Checking
6.2.1 Introduction to Alloy
Jackson introduced Alloy [46] as a little language for describing structural properties. Today
it can be used as a model finder, based on a SAT solver. The Alloy Analyzer works by
translating the model specified in the Alloy language into a boolean expression, which is
analyzed by SAT solvers embedded within the Alloy Analyzer. A user-specified scope on the
model elements bounds the domain, making it possible to create finite boolean formulas for
evaluation by SAT solvers. The Alloy Analyzer offers two analysis methods: simulation and
assertion checking [4]. In this section we will see how to adapt Alloy for generating models
according to specified metric values. In short, an Alloy model can be built using:
− Signatures. Signatures are used to model classes of objects, that is, sets.
− Predicates. Predicates give us a way to find model instances: we write a predicate
and then make Alloy produce instances that satisfy it. Asking Alloy to
find instances is similar to finding a model of a given schema.
− Facts. Facts are used to impose constraints on the model. Facts are global and always
apply. Put otherwise, every instance of a model must satisfy the facts.
− Functions. A function is an expression that returns a result.
− Assertions. These are assumptions about the model for which you can ask the analyzer
to find counter-examples.
After the model is built, its assertions can be verified by attempting to find a
counter-example. Alloy performs an exhaustive search within a bounded space, which eliminates
the possibility of missing an instance inside that space. The scope of the instance search
must be specified.
Recently, Shah et al. demonstrated a way to analyze UML class models [79] by transforming
them into Alloy models and then transforming the instances produced by Alloy back into UML
object diagrams.
Zito and Dingel [90] used Alloy to model package merging in UML. They used Alloy for
formalizing and analyzing different versions of package merge.
Bordbar and Anastasakis [16] used Alloy Analyzer as a tool for verification of a newly
introduced model called Abstract Description of Interaction, for modeling Web applications.
Their report transforms an abstract model of a web application into the Alloy language, and
analyzes it with Alloy Analyzer [3] to demonstrate unwanted behavior of the application.
Simons and Fernandez [82] used Alloy to model-check visual design notations. In their
work, they encoded the abstract syntax of the Discovery method [81] into an Alloy model, and
then checked it by running a trivial predicate for exactly one model instance, thereby
validating the consistency of the model.
6.2.2 Generating Models from Meta-Model Metrics with Alloy
In this section the idea of using a model checker for instantiating a meta-model is
implemented. That is, benchmarks are created according to given metric values, where the
choice of metrics is influenced by algorithmic considerations and examined with Weyuker's
properties [89] relative to algorithmic equivalence. As noted by McQuillan and Power [66],
defining metrics is a meta-modeling activity. Hence, the first step in generating (in other
words, finding) models is to define a meta-model in Alloy. Second, Alloy Analyzer is asked to
find a model with given values for the specified metrics (thus specifying the scope of the
search). In fact, it is impossible to find an instance of a model without specifying a scope
and thereby giving values to metrics. It is also possible to specify exactly how many
different instances of a meta-model should be found, generating an appropriate task sample
for benchmarking.
Consider the subset of the class diagram meta-model in figure 6.1. The metamodel presents
a class, an association, and a constraint. Suppose now that we want to generate instances of
this metamodel.
Figure 6.1: Class Diagram Meta-Model
Figure 6.2 shows the metamodel from figure 6.1 encoded in Alloy Analyzer, with additional
constraints to enforce UML syntax (for example, a class cannot be a superclass of itself).
The meta-model can be produced in two ways:
− The first, straightforward way, like the one in figure 6.2, is to write it by hand from
scratch in the Alloy Analyzer tool. In this case we must write all parts of our meta-model,
with the needed constraints as Alloy facts, and then find instances.
− The second, more economical way uses the UML2Alloy [5] tool. UML2Alloy supports
transformation of UML class diagrams accompanied by OCL constraints. On the
one hand, only part of the class diagram and OCL languages is supported by the automatic
transformation of the tool, so for complicated model finding the first way
of writing the meta-model may be preferable. On the other hand, it may be very effective
to use class diagrams with OCL for known and checked models of interest, since OCL is
widely used.
Figure 6.3 defines an empty predicate named NCM, whose satisfaction is trivial.
Applying the run command to the NCM predicate gives the result shown in figure 6.4.
Figure 6.2: A partial meta-model of UML in Alloy Analyzer.
Figure 6.3: Instance finding.
It should be noted here that Alloy Analyzer provides several output formats for the
found instances, which is extremely useful in the automation context. The output
can be in the following formats:
1. Visual (as in figure 6.4)
2. Browsable tree view (as in figure 6.5)
3. Textual:
(a) XML
(b) DOT format
With the model and the predicate NCM defined above, we can generate a model
according to specified metric values. The following metrics are supported:
1. Number of classes
2. Number of associations
3. Number of constraints. A constraint can be defined in a generic way, depending on
the elements it is imposed on, for example a multiplicity constraint on an association.
In the next section we classify the metrics into groups, and show how the metrics from the
example above, as well as more complex metric types, can be supported for model generation.
Figure 6.4: Instance of the specified metamodel
6.2.2.1 A General Process for Metric Specification for Model Generation
1. Specify the meta-model in Alloy Analyzer, according to the guidance presented earlier.
2. Select a metric suite.
3. Specify an empty Generate predicate.
4. Use the run command, specifying the number of instances to generate and exact or upper
bounds for every element chosen earlier.
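Step 4 of the process above lends itself to automation, because the run command is plain text derived from the chosen bounds. The following is a minimal sketch of such a generator; the function name alloy_run_command and the signature names Class and Association are assumptions made for illustration, since the Alloy meta-model of figure 6.2 is not reproduced here.

```python
def alloy_run_command(predicate, bounds):
    """Build the text of an Alloy run command.

    bounds maps a signature name to (count, exact); exact=True requests
    exactly that many atoms, exact=False treats the count as an upper bound.
    """
    parts = []
    for signature, (count, exact) in bounds.items():
        prefix = "exactly " if exact else ""
        parts.append(f"{prefix}{count} {signature}")
    return f"run {predicate} for " + ", ".join(parts)


# Hypothetical signature names; the real ones depend on the encoded meta-model.
command = alloy_run_command("Generate", {"Class": (10, True), "Association": (3, False)})
print(command)  # run Generate for exactly 10 Class, 3 Association
```

The emitted text would still have to be appended to the meta-model file and executed by Alloy Analyzer; this sketch only covers the string construction.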
Figure 6.5: Tree view of an instance of the specified metamodel
6.3 Automation: A Language For Metrics Values Definition
In the rest of this chapter we set up a basis for a functional language for defining metric
values. The metrics are classified into patterns and a general form for each pattern is
developed. Consider the following example of using such a language to create a benchmark
for FiniteSat. The language consists of:
− Number-Of-Classes(number) command, where number is the metric value.
− Number-Of-Associations(number) command.
− Number-Of-Non-Trivial-Multiplicity-Constraints(number) command.
This way, to repeat the study of the occurrence and relevance of finite satisfiability, we
could use the above commands, which reflect metric values, to create the problem sample.
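To make the shape of such a specification concrete, the following sketch reads a list of the three commands above into a dictionary of metric values. It is only an illustration under assumptions: the one-command-per-line layout, the function name parse_metric_commands and the sample numbers are hypothetical, since the thesis does not fix a concrete syntax beyond the command names.

```python
import re

# One command per line: a hyphenated name followed by a non-negative integer in parentheses.
COMMAND = re.compile(r"^\s*([A-Za-z][A-Za-z-]*)\((\d+)\)\s*$")


def parse_metric_commands(text):
    """Map each metric command name to its requested integer value."""
    values = {}
    for line in text.strip().splitlines():
        match = COMMAND.match(line)
        if match is None:
            raise ValueError(f"not a metric command: {line!r}")
        values[match.group(1)] = int(match.group(2))
    return values


# Hypothetical benchmark specification for FiniteSat.
spec = """
Number-Of-Classes(10)
Number-Of-Associations(5)
Number-Of-Non-Trivial-Multiplicity-Constraints(3)
"""
values = parse_metric_commands(spec)
print(values)
```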
It is important to note why the Alloy language is needed in the first place. Why not simply
use OCL instead? There is a reason beyond the fact that Alloy Analyzer is a
model checker, while OCL is just a specification language with several supporting
applications. As shown in this chapter, there is a need for variables ranging over model
elements, which is impossible in OCL, since it does not have any metadata. The only variables
available in OCL range over instances, which is not sufficient.
6.3.1 Metrics classification
Since we believe that Alloy is needed, and we know that it is possible to generate models
according to provided metric values, it would be useful to have a framework for model
generation using metrics with Alloy. To this end, we attempt to classify the metrics into
exclusive groups, with specific guidance for generating models through the metrics in each
group.
6.3.1.1 Pattern 1. Regular Size Metrics.
Regular size metrics are metrics where the number of instances of a specific class is
explicitly specified. Examples of such metrics are the well-known Number of Classes in a
Model, introduced by Kim and Boldyreff [52] for measuring a class diagram, and Weighted
Methods per Class, introduced earlier by Chidamber and Kemerer [27] for measuring a single
class.
These kinds of metrics have a common pattern that looks like:
− NUMBER OF ? OBJECTS = X
where the symbol ? stands for the name of the class and the symbol X stands for the
numeric value of the metric.
For regular size metrics, we must encode the needed meta-model in Alloy Analyzer
and specify the exact number (or an upper bound) of meta-model elements to generate, as well
as the number of instances to generate in the overall process.
Figure 6.6: Example of UML meta-model with two elements: X and Y
Figure 6.7: Example of Alloy-written meta-model with two elements: X and Y
Consider the meta-model in figure 6.6. The same metamodel written in Alloy is
shown in figure 6.7. The meta-model contains two class elements, X and Y, such that
each instance of Y is associated with exactly one instance of X via the association field. By
specifying the Generate predicate and applying the run command, we ask Alloy Analyzer
to generate one instance of the meta-model, such that there is one instance of X and two
instances of Y.
Figure 6.8: A Generated instance of the meta-model
The result of this generation is shown in figure 6.8. Following the general process
of generating models by metrics, we summarize the process of generating instances according
to regular size metrics in the following steps:
1. Specify the meta-model in Alloy Analyzer.
2. Select the metric suite. That is, select the classes whose number of instances you want
to specify in the generated model.
3. Specify an empty Generate predicate.
4. Use the run command, specifying the number of instances to generate and exact or upper
bounds for every element chosen earlier.
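The effect of these steps on the X and Y meta-model can be imitated outside Alloy by a brute-force enumeration. The sketch below is not how Alloy Analyzer works internally (it uses a SAT solver and symmetry breaking); it only illustrates what "all instances with exactly one X and two Y" means. The function name generate_instances and the atom names X0, Y0, Y1 are hypothetical.

```python
from itertools import product


def generate_instances(num_x, num_y):
    """Yield every instance with exactly num_x X atoms and num_y Y atoms,
    where each Y atom is linked to exactly one X atom through 'field'."""
    xs = [f"X{i}" for i in range(num_x)]
    ys = [f"Y{i}" for i in range(num_y)]
    # Each choice of one X target per Y atom is one candidate instance.
    for targets in product(xs, repeat=num_y):
        yield {"X": xs, "Y": ys, "field": dict(zip(ys, targets))}


instances = list(generate_instances(1, 2))
print(len(instances))          # 1
print(instances[0]["field"])   # {'Y0': 'X0', 'Y1': 'X0'}
```

With two X atoms and two Y atoms the same enumeration yields four instances, some of which differ only by renaming atoms; a model finder would normally collapse such symmetric instances.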
6.3.1.2 Pattern 2. Ratio Metrics.
Ratio metrics specify some fraction or factor, often in the form of a percentage, for a
sub-group inside a whole, usually bigger, group. For example, consider the ratio metric
NASM/NCM introduced in [59], and the method hiding factor proposed by Abreu and Melo in [1].
These metrics set the ratio between the number of classes and the number of associations in
a model, and between the number of private methods and the number of public methods within a
class, respectively.
These kinds of metrics have a common pattern that looks like:
− RATIO OF A/B = X
where the symbols A and B are regular size metrics, and the symbol X stands for the ratio
metric value.
Figure 6.9: Example of UML meta-model with two elements: X and Y
Consider the meta-model in figure 6.9. The same metamodel written in Alloy is
shown in figure 6.10. Suppose we want to set the value of the metric X/Y = 1/2, that is, the
number of Y instances is double the number of X instances. In order to set the value of this
metric, we use the fact keyword of Alloy.
Figure 6.10: Example of meta-model with two elements: X and Y, with a 1:2 ratio between them
We now summarize the framework for generating models according to these kinds of metrics in
the following steps:
1. Specify the meta-model in Alloy Analyzer.
2. Create an appropriate fact statement, as shown in figure 6.10, to specify the ratio value.
3. Specify an empty Generate predicate.
4. Use the run command, specifying the number of instances to generate and exact or upper
bounds for every element chosen earlier.
The concept of ratio metrics is orthogonal to that of size metrics, and the two can be
applied together to generate more accurate models. Note that when using Alloy, values for
regular size metrics must also be provided in order to generate the model. Generating models
according to ratio metrics is somewhat complicated: one cannot simply set one metric value
here and another there and assume that everything will be generated properly. A conflict
between size metrics and a ratio metric can occur if the size metric values are explicit and
no model with the specified ratio can be found. For example, asking to generate a model with
exactly 10 classes, exactly 3 associations and NASM/NCM = 1/2 is impossible.
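Such a conflict can be detected before any model finder is invoked. The sketch below is a minimal check, assuming that NASM/NCM means the number of associations divided by the number of classes and that both sizes are requested exactly; the function name ratio_conflict is hypothetical.

```python
from fractions import Fraction


def ratio_conflict(numerator_size, denominator_size, ratio):
    """Return True when two exact size values cannot satisfy the requested ratio."""
    if denominator_size == 0:
        return True  # the ratio is undefined, so no model can satisfy it
    return Fraction(numerator_size, denominator_size) != ratio


# Assumed reading: NASM = 3 associations, NCM = 10 classes, requested ratio 1/2.
print(ratio_conflict(3, 10, Fraction(1, 2)))  # True
print(ratio_conflict(5, 10, Fraction(1, 2)))  # False
```

Using exact fractions avoids floating-point comparisons. When the sizes are given as upper bounds rather than exact values, the check would instead have to search for any pair of sizes within the bounds that has the requested ratio.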
6.3.1.3 Pattern 3. Quantified Metamodel Association Restriction.
A good way to understand this type of metric is by example. Consider the NOC metric
proposed by Chidamber and Kemerer [27]. The NOC metric specifies the number of children
of a class. That is, we want to quantify the size of the subclass relationship for a class,
as in figure 6.1.
These kinds of metrics have a common pattern that looks like:
− NUMBER OF A FOR A B = X
where A is the type of the association we restrict, B is the context class whose
association is being restricted, and X is the metric value. Taking the example above we get
NUMBER OF CHILDREN FOR A CLASS = X.
We proceed with the example from the previous section: the metamodel of figure 6.1,
written in Alloy, is given in figure 6.11.
Figure 6.11: Example of partial UML meta-model in Alloy
Applying the run command as shown, we get the generated model shown in figure 6.12.
The generated model has exactly 10 classes, where every class has exactly one sub-class. That
is, we explicitly specified the NOC value for all classes in the model.
Figure 6.12: A Generated model where each class has exactly one sub class.
We summarize producing models according to metrics of Pattern 3 in the following steps:
1. Specify the meta-model in Alloy Analyzer.
2. Specify a fact statement, limiting the size of the elements inside an association.
3. Specify an empty Generate predicate.
4. Use the run command, specifying the number of instances to generate and exact or upper
bounds for every element chosen earlier.
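Whether a generated model actually meets a Pattern 3 restriction can be verified by recomputing the metric on the result. The following sketch computes NOC for every class and checks a uniform NOC value. It assumes single inheritance, represented as a dictionary from each subclass to its direct superclass; the function names and the three-class sample are hypothetical.

```python
from collections import Counter


def number_of_children(classes, superclass_of):
    """NOC per class: how many classes name it as their direct superclass."""
    counts = Counter(superclass_of.values())
    return {c: counts.get(c, 0) for c in classes}


def satisfies_noc(classes, superclass_of, value):
    """True when every class has exactly 'value' direct subclasses."""
    return all(n == value for n in number_of_children(classes, superclass_of).values())


# Hypothetical hierarchy: B and C both inherit directly from A.
classes = ["A", "B", "C"]
superclass_of = {"B": "A", "C": "A"}
noc = number_of_children(classes, superclass_of)
print(noc)                                       # {'A': 2, 'B': 0, 'C': 0}
print(satisfies_noc(classes, superclass_of, 1))  # False
```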
The metrics that fall into the Pattern 3 group are obviously very useful; they are well
known and have been widely cited by researchers recently. However, as presented above, the
pattern may not always be useful, since we probably do not always want to specify the metric
for all the context elements (marked B above). This observation leads to a refinement of
this group of metrics.
A Refinement for Pattern 3.
Figure 6.13 (in UML) and figure 6.14 (in Alloy) show a metamodel with Z as a subset of X.
Using this technique we can impose size constraints on a set inside an association. For
example, if we want to generate a model with a known number of X instances, each
of which is associated with exactly three elements of Y, we define a subset of X (Z in figure
6.14) and explicitly generate the wanted number of Z elements.
To be more precise, and to obtain the exact number of Z-elements (that is, elements with
three associated Y elements), we can state by a fact that each of the X \ Z elements has a
number of associated Y elements that is different from three.
Figure 6.13: Example of meta-model with three elements: X, Y and Z
Figure 6.14: Example of meta-model with three elements: X, Y and Z
Generating instances according to the above meta-model, we get the following results. Figure
6.15 shows the first instance, with two Z-elements.
However, as noted above, more than one instance of the meta-model can be
generated. Alloy Analyzer searches the entire specified search space, and if needed, another
instance can be obtained. For example, figure 6.16 shows another instance of the
same meta-model as figure 6.15, with the same metric values.
Figure 6.15: A Generated instance of the meta-model
We now summarize the framework for generating models according to these kinds of metrics in
the following steps:
1. Specify the meta-model in Alloy Analyzer.
2. Specify a subset element for the desired element to impose a limit on, as shown in
figure 6.14.
3. Specify a fact statement, limiting the size of the elements inside an association.
4. Optionally, for even greater refinement, explicitly constrain the excluded group with
fact{all element : X-Z | # element.field != 3}, obtaining the exact number of Z-elements.
5. Specify an empty Generate predicate.
6. Use the run command, specifying the number of instances to generate and exact or upper
bounds for every element chosen earlier.
Pattern 3. Refinement 2.
Observe that the first refinement only promises that the wanted number of objects will
be generated. However, this is only a lower bound for that number. It is possible to impose a
strict limit on the number of the Z-objects referred to earlier by constraining the excluded
group:
fact{all element : X-Z | # element.field != 3}
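The difference between the two refinements can be checked on a concrete instance. The sketch below assumes that field maps each X element to the set of Y elements associated with it; the function names and the sample instance are hypothetical. An instance respects the second refinement exactly when the declared Z set equals the set of X elements that have three associated Y elements.

```python
def elements_with_links(field, target_size=3):
    """X elements whose set of associated Y elements has exactly target_size members."""
    return {x for x, ys in field.items() if len(ys) == target_size}


def respects_exact_z(field, declared_z, target_size=3):
    """True when every Z element has target_size links and no element of X - Z does."""
    return elements_with_links(field, target_size) == set(declared_z)


# Hypothetical instance with three X elements.
field = {
    "X0": {"Y0", "Y1", "Y2"},
    "X1": {"Y0", "Y1", "Y2"},
    "X2": {"Y0"},
}
print(respects_exact_z(field, {"X0", "X1"}))  # True
print(respects_exact_z(field, {"X0"}))        # False: X1 is outside Z but has three links
```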
Figure 6.16: A Generated instance of the meta-model
6.3.1.4 Pattern 4. General Structure Metrics.
General structure metrics are probably a very complicated type of metrics. Such metrics are
often not easy to extract, and may need a non-trivial algorithm in order to extract them
from a given model. Generating models that have a specific structure defined
by structure metrics is not a trivial task.
Examples of general structure metrics are:
− Number of Cycles in a Model (NCYC) [59].
− Path Length between Classes.
− Average, Longest, Lowest Depth of Inheritance Tree (DIT) [27].
Note that, in light of the method presented in this report, general structure metrics
alone are not enough to generate a model. Structure metrics should be interleaved with
other metric types in general and with size metrics in particular.
We now summarize the framework for generating models according to these kinds of metrics in
the following steps:
1. Specify the meta-model in Alloy Analyzer.
2. Create appropriate fact statements, which characterize the metric values.
3. Specify an empty Generate predicate.
4. Use the run command, specifying the number of instances to generate and exact or upper
bounds for every element chosen earlier in the meta-model specification step.
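As an illustration of why structure metrics need an algorithm rather than a simple count, the sketch below extracts the depth of inheritance tree (DIT) of each class and summarizes it as average, longest and lowest depth. It assumes single inheritance given as a dictionary from subclass to direct superclass; the function names and the sample hierarchy are hypothetical.

```python
def depth_of_inheritance(cls, superclass_of):
    """Number of superclass steps from cls up to the root of its hierarchy."""
    depth = 0
    seen = {cls}
    while cls in superclass_of:
        cls = superclass_of[cls]
        if cls in seen:
            raise ValueError("cyclic inheritance")
        seen.add(cls)
        depth += 1
    return depth


def dit_summary(classes, superclass_of):
    """Average, longest and lowest DIT over all classes of a model."""
    depths = [depth_of_inheritance(c, superclass_of) for c in classes]
    return {
        "average": sum(depths) / len(depths),
        "longest": max(depths),
        "lowest": min(depths),
    }


# Hypothetical hierarchy: A is the root, B and D extend A, C extends B.
classes = ["A", "B", "C", "D"]
superclass_of = {"B": "A", "C": "B", "D": "A"}
summary = dit_summary(classes, superclass_of)
print(summary)  # {'average': 1.0, 'longest': 2, 'lowest': 0}
```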
Chapter 7
Conclusions and Future Work
In this thesis, we developed a method for automatic metric-driven benchmark creation for
model correctness algorithms. We also extended methods for detecting and addressing
problems of finite satisfiability in UML class diagrams in a way that is simple and efficient
and that provides the foundations for expanding UML CASE tools to address these finite
satisfiability problems. Furthermore, this thesis demonstrated the applicability and
scalability of the developed methods by developing a platform that supports them. A basis
for further benchmarking is set with a series of experiments. Finally,
a basis for a functional metric-driven language for benchmark specification and creation is
set up. This research can be expanded in a straightforward way in the following directions:
1. Developing a fully functional language for metrics description, based on metric
patterns. As a result, a proprietary language for model-level metrics might be developed.
2. Developing an engine for automatically translating (compiling) the language into the
input of a model checker, such as Alloy.
3. Developing metrics that describe hierarchical data, with possibly unbounded size and
recursive definition. For example: graphs, cycles, lists, etc.
4. During this thesis, we noticed that it is very difficult to describe complex hierarchical
data structures with Alloy. In the future it is important to examine other model checkers,
which may have a more expressive language, or to build a specific target language based on
Alloy.
Further research directions include:
1. Developing metrics for other types of models. For example, metrics for sequence
diagrams.
2. Developing metrics for interaction and composition of di�erent model types.
3. Exploring the connection between metrics and complexity, and studying model complexity
as motivated by metrics.
Appendix A
Reasoning Infrastructure
Implementation
A.1 Implementation of FiniteSat Algorithm
Today, there is a quite solid base of theoretical knowledge about reasoning tasks on UML
class diagrams, studied in [63, 61, 62, 56, 50, 9, 42, 22, 23, 24] and many others. However,
only a very small part of this research has been tested in practice and implemented. Most of
the implemented methods were partial, at prototype level, or used different and tricky
programming techniques than those studied in theory [21, 61, 48]. In this work we present an
implementation of the methods described in previous chapters. The implemented methods
rely heavily on the USE [37] software and make use of linear programming methods,
particularly the well-known Simplex algorithm (for further reading see [67, 29]).
The main goal of this implementation was to create an infrastructure for reasoning
algorithms; put otherwise, a piece of software that is carefully designed and implemented
for its current needs and future extensions. To achieve this goal several steps were taken.
These steps include using the Java language and the Eclipse platform [34], and using the
ANTLR [75] compiler-compiler to allow future features and a flexible OO design. These choices
were made because of their popularity and the effective implementation experience with them.
The secondary goal was to establish a basis for future benchmarking of such reasoning
algorithms and to check the scalability of our methods.
A.2 Implementation in Detail
In this section we present our implementation in detail. The tool we implemented supports
the following constraints imposed on class diagrams, which can cause finite satisfiability
problems:
1. Multiplicity constraints
2. Class hierarchy constraints interaction with associations
3. Constrained generalization sets
4. Qualified associations
Following the ideas of [61], our tool performs three distinct steps:
1. Input Processing (Complete Model Creation)
2. Inequalities System Creation
3. Solving the Inequalities System
We will now discuss each of the three steps mentioned above in detail.
Figure A.1: Reasoning Tool Internal Structure
1. Input Processing (Complete Model Creation). Most (and maybe all) of the
CASE tools available today to the software analyst or designer save UML models in
a symbolic representation in the XMI format. XMI is a special form of XML in which all
data about the UML model, as well as its visual representation on the screen, is saved.
Unfortunately, different CASE tools (for example Poseidon, ArgoUML, Rational Rose or Visual
Paradigm) use different proprietary formats and conventions of XMI, so a decision
has to be made. To overcome this variety of formats, we chose the USE
[37] format to store the symbolic representation of UML models. We justify
this choice by the partially ready preprocessing implemented by [37] and by its comfortable
and readable textual notation, in addition to its visualization options. We summarize the
extended grammar of USE for our tool by the following grammar in EBNF [39] style:
UMLModel → Name ModelBody
Name → IDENT
ModelBody → (Enumeration)* (Class)* (Association)* (GS)*
Enumeration → "enum" Name "{" Name+ "}"
Class → ("abstract")? "class" Name "attributes" (Attribute)+
Association → ("association" | "composition" | "aggregation") Name "between" AssociationEnd AssociationEnd
AssociationEnd → Name Multiplicity "role" Name Qualifier
Qualifier → "qualifier" "attributes" (Attribute)+
Attribute → IDENT ":" Type
GS → "gs" GsName GsType Super Name+
GsName → ("name" IDENT)?
GsType → ("type" ("overlapping" | "complete" | "disjoint" | "incomplete" |
"overlappingComplete" | "overlappingIncomplete" | "disjointComplete" |
"disjointIncomplete"))?
Super → ("super" IDENT)
SubList → ("subClasses" Name+)
IDENT → (a..z | A..Z) (a..z | A..Z | 0..9)*
Multiplicity → "[" (DIGIT | "*") (".." (DIGIT | "*"))? "]"
In this way a UML model can be specified manually or generated automatically using
this simple, intuitive and well-defined grammar. Appropriate changes are made to the
org.tzi.use.uml.mm package of [37]. Once the input is read, a model representation in
the form of Java objects is created. The model maps naturally onto an object structure
following the UML meta-model rules [72]. The main addition to the initial grammar is
the constraints mentioned above, which can cause a finite satisfiability problem if they
interact or conflict.
2. Inequalities System Creation. The next step in our tool is the creation of linear
inequalities. The inequalities are created to reflect the structure of the class diagram
and all the constraints imposed on it. The inequalities are not created all together,
but separately by type:
(a) Inequalities to reflect classes in the model
(b) Inequalities to reflect class hierarchy
(c) Inequalities to reflect associations with multiplicity constraints
(d) Inequalities to reflect qualified associations
(e) Inequalities to reflect generalization set constraints
The final output is an instance of an optimization problem that can be solved by a linear
programming tool via the simplex method.
3. Solving the Inequalities System. To solve the linear programming problem we
use an open source Java tool for operations research [73]. The inequalities system
constructed above, in the form of a linear programming problem instance, is checked for
the presence of a solution subject to constraints. The latter constraints are exactly the
linear inequalities system created.
As mentioned above, we use the Simplex algorithm introduced by Dantzig [29] in order
to solve the linear programming problem. Although the worst-case running time of this
algorithm is known to be exponential, in contrast to Karmarkar's algorithm [51], whose
running time is polynomial from a theoretical point of view, it has been shown to have
remarkable performance and ease of implementation in practice.
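The idea behind steps 2 and 3 can be illustrated on plain multiplicity constraints without a linear programming library. The sketch below is a deliberately simplified stand-in and not the tool's implementation: instead of building an inequalities system and solving it with the simplex method, it searches small positive class sizes by brute force and covers only binary associations with multiplicity constraints. All names are hypothetical. For an association between A and B, a number of links r exists when it can lie between the minimum and maximum totals demanded by both ends.

```python
from itertools import product


def link_count_exists(size_a, size_b, per_a, per_b):
    """Check whether some number of links satisfies both ends of an association.

    per_a = (min, max) number of B partners of each A instance,
    per_b = (min, max) number of A partners of each B instance.
    A max of None stands for an unbounded multiplicity ("*").
    """
    low = max(per_a[0] * size_a, per_b[0] * size_b)
    high_a = float("inf") if per_a[1] is None else per_a[1] * size_a
    high_b = float("inf") if per_b[1] is None else per_b[1] * size_b
    return low <= min(high_a, high_b)


def find_positive_sizes(classes, associations, bound=6):
    """Return the first assignment of positive class sizes that fits, or None."""
    for sizes in product(range(1, bound + 1), repeat=len(classes)):
        size = dict(zip(classes, sizes))
        if all(link_count_exists(size[a], size[b], per_a, per_b)
               for a, b, per_a, per_b in associations):
            return size
    return None


# r1: every A has exactly 2 B partners, every B has exactly 1 A partner.
r1 = ("A", "B", (2, 2), (1, 1))
# r2: a one-to-one association between the same classes.
r2 = ("A", "B", (1, 1), (1, 1))
print(find_positive_sizes(["A", "B"], [r1]))      # {'A': 1, 'B': 2}
print(find_positive_sizes(["A", "B"], [r1, r2]))  # None
```

A result of None only means that no solution exists within the search bound. The linear programming formulation used by the tool decides the question without such a bound.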
A.3 Structural Architecture and Conclusions
The implementation we presented in this chapter can serve as an infrastructure for reasoning
software due to its flexibility. To the best of our knowledge, it is the first scalable
implementation handling the necessary amount of constraints and properties of class diagrams.
We summarize the structure of the implemented tool below. Figure A.2 presents the structural
architecture of our tool. The flow shown in figure A.2 demonstrates, zoomed in,
the steps within our tool. The symbolic representation is scanned and an AST model is created.
Later, it is transformed into a Java object form representing the UML meta-model structure
via the Model-to-Model transformation part. The inequalities system is then constructed and a
search for a solution is started. If a solution exists we report finite satisfiability;
otherwise we report that the model is not finitely satisfiable.
Figure A.2: The structural architecture of reasoning tool
Figure A.3 presents a class diagram representing the static structure of the implemented
tool. We present it in this work in
order to stimulate further extension of the tool and to demonstrate the flexibility of its
design. For example, in order to add another class diagram constraint to extend the FiniteSat
algorithm, one needs only to implement another specialization of the InequalitiesCreator
class and invoke it. No other change is needed. Another point is that tests and
experiments can easily be made due to the current structure. For example, it is possible to
check the tool's performance under different sets of constraints, by simply not
invoking the appropriate inequalities creator.
Figure A.3: Class diagram of our tool's static structure
References
[1] Fernando Brito e Abreu and Walcelio Melo. Evaluating the impact of object-oriented
design on software quality. In METRICS '96: Proceedings of the 3rd International
Symposium on Software Metrics, page 90, Washington, DC, USA, 1996. IEEE Computer
Society.
[2] Fernando Brito e Abreu and Aline Lúcia Baroni. A formal library for aiding metrics
extraction. In International Workshop on Object-Oriented Re-Engineering at ECOOP,
2003.
[3] Alloy Analyzer. Alloy analyzer web site : http://alloy.mit.edu. 2010.
[4] Kyriakos Anastasakis. Uml2alloy reference manual. 2009.
[5] Kyriakos Anastasakis, Behzad Bordbar, Geri Georg, and Indrakshi Ray. Uml2alloy: A
challenging model transformation. In ACM/IEEE 10th International Conference on
Model Driven Engineering Languages and Systems (MoDELS), pages 436-450. Springer,
2007.
[6] P. Andre, A. Romanczuk-Requile, J.-C. Royer, and Vasconcelos. Checking the consis-
tency of uml class diagrams using larch prover. In The third Rigorous Object-Oriented
Methods Workshop, 2000.
[7] A. Artale, D. Calvanese, R. Kontchakov, V. Ryzhikov, and M. Zakharyaschev. Com-
plexity of reasoning in entity relationship models. In 20th International Workshop on
Description Logics (DL-2007), 2007.
[8] K. Baclawski, M. Kokar, J. Smith, and J. Letkowski. Consistency checking of ontologies
expressed in uml. In International Conference on Formal Ontologies in Information
Systems, 2001.
[9] M. Balaban and A. Maraee. Consistency of uml class diagrams with hierarchy
constraints. In Next Generation Information Technologies and Systems, pages 71-82,
London, UK, 2006. Springer-Verlag.
[10] M. Balaban and P. Shoval. Meer - an eer model enhanced with structure methods.
information systems. Advanced Topics in Database Research, 2002.
[11] Mira Balaban and Azzam Maraee. Finite satisfiability of uml class diagrams with
constrained class hierarchy. In (To Appear).
[12] Mira Balaban, Azzam Maraee, and Arnon Sturm. Management of correctness problems
in uml class diagrams - towards a pattern-based approach. In International Journal of
Information System Modeling and Design, 2010.
[13] A. Baroni. A formal definition object-oriented design metrics. In Master Thesis, Vrije
Universiteit Brussel, 2002.
[14] A. Baroni, S. Braz, and F. Abreu. Using OCL to formalize object-oriented design
metrics definitions. In ECOOP'02 Workshop on Quantitative Approaches in OO Software
Engineering, 2002.
[15] D. Berardi, D. Calvanese, and G. De Giacomo. Reasoning on uml class diagrams.
Artificial Intelligence, 168(1):70-118, 2005.
[16] Behzad Bordbar and Kyriakos Anastasakis. Mda and analysis of web applications.
In Proceeding of VLDB Workshop on Trends in Enterprise Application Architecture
(TEAA 2005) Lecture notes in Computer Science, volume 3888, pages 44�45, 2005.
[17] F. Boufares and H. Bennaceur. Consistency problems in er-schemas for database sys-
tems. Information Sciences, 163(4):263�274, 2004.
[18] M. Bunge. Treatise on Basic Philosophy: Ontology 1: The Furniture of the World.
Riedel, Boston, 1977.
107
References
[19] M. Bunge. Treatise on Basic Philosophy: Ontology 2: The World of Systems. Riedel,
Boston, 1979.
[20] M. Cadoli, D. Calvanese, G. De Giacomo, and T. Mancini. Finite satis�ability of uml
class diagrams by constraint programming. In The Workshop on CSP Techniques with
Immediate Application, 2004.
[21] M. Cadoli, D. Calvanese, G. De Giacomo, and T. Mancini. Finite model reasoning on
uml class diagrams via constraint programming. In The s10th Congress of the Italian
Association for Arti�cial Intelligence, pages 36�47. Springer, 2007.
[22] D. Calvanese. Finite model reasoning in description logics. In The 5th Int. Conf. on
the Principles of Knowledge Representation and Reasoning (KR-96), pages 292�303,
California, 1996. Morgan Kaufmann.
[23] D. Calvanese, G. De Giacomo, M. Lenzerini, D. Nardi, and R. Rosati. Description logic
framework for information integration. In The Sixth International Conference on the
Principles of Knowledge Representation and Reasoning (KR'98), pages 2�13, 1998.
[24] D. Calvanese and M. Lenzerini. On the interaction between isa and cardinality con-
straints. In The 10th IEEE Int. Conf. on Data Engineering, pages 204�213, Washington,
DC, USA, 1994. IEEE Computer Society.
[25] P. P. Chen. The entity-relationship model-toward a uni�ed view of data. In ACM
Transaction on Database Systems, 1(1), 9-36,, 1976.
[26] Chidamber and Kemerer. A metrics suite for object oriented design. IEEE Transactions
on Software Engineering, 20(6):476 � 493, 1994.
[27] S. R. Chidamber and C. F. Kemerer. A metrics suite for object oriented design. IEEE
Trans. Softw. Eng., 20(6):476�493, 1994.
108
References
[28] Hausteiny S.: Purvis M. Crane�eld, S. Uml-based ontology modelling for software
agents. In Proc. of Ontologies in Agent Systems Workshop, Montreal, 2001.
[29] G.B. Danzig. Linear Programming and Extensions. Reading, Princeton University
Press, 1963.
[30] Udo Kelter Dirk Ohst, Michael Welle. Merging uml documents. 2004.
[31] J. Dullea and I. Song. An analysis of structural validity of ternary relationships in
entity-relationship modeling. In In Proceedings of Seventh International Conference on
Information and Knowledge Management (CIKM `98), pages 331�339, 1998.
[32] J. Dullea, I. Song, and I. Lamprou. An analysis of structural validity in entity-
relationship modeling. Data and Knowledge Engineering, 47 (2):167�205, 2003.
[33] P. Erdos and A. Renyi. On random graphs. In Publ. Math. Debrecen, volume 6, pages
290�297, 1959.
[34] The Eclipse Foundation. http://www.eclipse.org. 2008.
[35] M. Fowler. Refactoring: improving the design of existing code. Addison-Wesley Long-
man Publishing Co., Inc., Boston, MA, USA, 1999.
[36] Marcela Genero, Mario Piattini, and Coral Caleron. A survey of metrics for uml class
diagrams. Journal of Object Technology, 4:59�92, 2005.
[37] Martin Gogolla, Fabian Buttner, and Mark Richters. Use: A uml-based speci�cation
environment for validating uml and ocl. Science of Computer Programming, pages
69:27�34, 2007.
[38] Michael Goodrich and Roberto Tamassia. Algorithm Design. John Wiley and Sons,
Inc., 2002.
109
References
[39] D. Grune, H. Bal, C. Jacobs, and K. Langendoen. Modern Compiler Design. John
Wiley and Sons, 2000.
[40] Wagner G. Guarino N. van Sinderen Guizzardi, G. M.: An ontologically well-founded
pro�le for uml conceptual models. In 16th International Conference on Advanced In-
formation Systems Engineering (CAiSE), Latvia, 2004.
[41] T.A. Halpin. A logical analysis of information systems: Static aspects of the data-
oriented perspective, ph.d. thesis, department of computer science, university of queens-
land, brisbane, australia. 1989.
[42] S. Hartmann. Graph-theoretical methods to construct entity-relationship databases. In
The 21st International Workshop on Graph-Theoretic Concepts in Computer Science,
pages 131�145, London, UK, 1995. Springer-Verlag.
[43] S. Hartmann. On the consistency of int-cardinality constraints. In Proceedings of the
17th International Conference on Conceptual Modeling, pages 150�163, London, UK,
1998. Springer-Verlag.
[44] S. Hartmann. Coping with inconsistent constraint speci�cations. In The 20th In-
ternational Conference on Conceptual Modeling, pages 241�255, London, UK, 2001.
Springer-Verlag.
[45] S. Hartmann. On the implication problem for cardinality constraints and functional
dependencies. Annals of Mathematics and Arti�cial Intelligence, 33(2-4):253�307, 2001.
[46] Daniel Jackson. Alloy: a lightweight object modelling notation. ACM Trans. Softw.
Eng. Methodol., 11(2):256�290, 2002.
[47] Yan-Bing Jiang, Wei-Zhong Shao, Zhi-Yi Ma, and Yao-Dong Feng. On the formalized
semantics of static modeling elements in uml. In Proceedings of the 4th International
110
References
Conference on Formal Engineering Methods: Formal Methods and Software Engineer-
ing, pages 500 � 510. Springer-Verlag London, UK, 2002.
[48] Daniel Riera Jordi Cabot, Robert Claris. Umltocsp: a tool for the formal veri�cation
of uml/ocl models using constraint programmingdemonstration at the 22th int. conf.
on automated software engineering (ase'07). 2007.
[49] K. Kaneiwa and K Satoh. Consistency checking algorithms for restricted uml class di-
agrams. in proceedings of the fourth international symposium on foundations of infor-
mation and knowledge systems (foiks2006), lncs 3861, springer-verlag, 219-239. (2006).
[50] K. Kaneiwa and S. Satoh. Consistency checking algorithms for restricted uml class
diagrams. In In Proceedings of the Fourth International Symposium on Foundations of
Information and Knowledge Systems, 2006.
[51] Narendra Karmarkar. A new polynomial time algorithm for linear programming. Com-
binatorica, pages 373 � 395, 1984.
[52] Hyoseob Kim and Cornelia Boldyre�. Developing software metrics applicable to uml
models. In 6th ECOOP Workshop on Quantitative Approaches in Object-Oriented Soft-
ware Engineering, 2002.
[53] C. Lange, M. Chaudron, and Muskens. J. In practice: Uml software architecture and
design description. IEEE Software, 23(2), 2006.
[54] K. Lano, D. Clark, and K. Androutsopoulos. Formal veri�cation of object-oriented
models. in integrated formal methods. In Integrated Formal Methods: 4th International
Conference, IFM 2004, 2004.
[55] Craig Larman. Applying UML and Patterns. Prentice-Hall,Inc., 2002.
[56] M. Lenzerini and P. Nobili. On the satis�ability of dependency constraints in entity-
relationship schemata. Information Systems, 15(4):453�461, 1990.
111
References
[57] Wei Li and Sallie Henry. Object-oriented metrics that predict maintainability. J. Syst.
Softw., 23(2):111�122, 1993.
[58] C. Lutz, U. Sattler, and L. Tendera. The complexity of �nite model reasoning in
description logics. Inf. Comput., 199(1-2):132�171, 2005.
[59] Victor Makarenkov, Pavel Jelnov, Azzam Maraee, and Mira Balaban. Finite satis�a-
bility of class diagrams: practical occurrence and scalability of the �nitesat algorithm.
In MoDeVVa '09: Proceedings of the 6th International Workshop on Model-Driven En-
gineering, Veri�cation and Validation, pages 1�10, New York, NY, USA, 2009. ACM.
[60] Ma Esperanza Manso, Marcela Genero, and Mario Piattini. No-redundant metrics
for uml class diagram structural complexity. In Lecture Notes in Computer Science :
Advanced Information Systems Engineering, 2003.
[61] A. Maraee. E�cient methods for solving �nite satis�ability problems in uml class
diagrams. Master's thesis, Ben-Gurion Univ., Israel, 2007.
[62] A. Maraee and M Balaban. E�cient reasoning about �nite satis�ability of uml class
diagrams with constrained generalization sets. In The 3rd European Conference on
Model-Driven Architecture, pages 17�31. Springer, 2007.
[63] A. Maraee and M Balaban. A uml-based method for deciding �nite satis�ability in
description logics. In The 21st International Workshop on Description Logics, DL2008,
Dresden, Germany, 2008.
[64] A. Maraee and M Balaban. E�cient recognition of �nite satis�ability in uml class
diagrams: Strengthening by propagation of disjoint constraints. In The Second In-
ternational Conference on Model Based Systems Engineering, MBSE09, Haifa, Israel,
2009.
112
References
[65] A. Maraee, V. Makarenkov, and M. Balaban. E�cient recognition and detection of �nite
satis�ability problems in uml class diagrams: Handling constrained generalization sets,
quali�ers and association class constraints. In 1st International Workshop on "Model co-
evolution and consistency management", ACM/IEEE 119th International Conference
on Model Driven Engineering Languages and Systems (MoDELS'08), 2008.
[66] Jacqueline A. McQuillan and James F. Power. Towards re-usable metric de�nitions
at the meta-level. In Towards re-usable metric de�nitions at the meta-level, Nantes,
France, 2006.
[67] Nimrod Megiddo. Linear programming. Encyclopedia of Microcomputers, 1991.
[68] T. Mens. A state-of-the-art survey on software merging. IEEE Trans. Softw. Eng.,
28(5):449�462, 2002.
[69] Tom Mens and Michele Lanza. A graph-based metamodel for object-oriented soft-
ware metrics. Electronic Notes in Theoretical Computer Science, 72(2):57 � 68, 2002.
GraBaTs 2002, Graph-Based Tools (First International Conference on Graph Transfor-
mation).
[70] M. Minsky. A framework for representing knowledge. technical report. In UMI Order
Number: AIM-306., Massachusetts Institute of Technology, 1974.
[71] Sanjay Misra and Ibrahim Akman. Applicability of weyuker's properties on oo met-
rics: Some misunderstandings. Computer Science and Information Systems, 5(1):17�23,
2008.
[72] OMG. The Uni�ed Modeling Language (OMG UML), Superstructure, V2.3. 2010.
[73] OPSRESEARCH. http://opsresearch.com. 2008.
[74] Liang. P. Formalization of static and dynamic uml using algebraic. Master's thesis,
University of Brussel, 2001.
113
References
[75] Terence Parr. Antlr parser generator. www.antlr.org. 2008.
[76] Julia Rubin, Marsha Chechik, and Steve M. Easterbrook. Declarative approach for
model composition. In MiSE '08: Proceedings of the 2008 international workshop on
Models in software engineering, pages 7�14, New York, NY, USA, 2008. ACM.
[77] J. Rumbaugh., G. Jacobson, and G. Booch. The Uni�ed Modeling Language Reference
Manual Second Edition. Adison Wesley, 2004.
[78] A. Schild. A correspondence theory for terminological logics:preliminary report. Tech-
nical Report KIT-BACK, FR 5-12, 1991.
[79] Seyyed M. A. Shah, Kyriakos Anastasakis, and Behzad Bordbar. From uml to alloy
and back again. In MoDeVVa '09: Proceedings of the 6th International Workshop on
Model-Driven Engineering, Veri�cation and Validation, pages 1�10, New York, NY,
USA, 2009. ACM.
[80] Susan Elliott Sim, Steve Easterbrook, and Richard C. Holt. Using benchmarking to
advance research: a challenge to software engineering. In ICSE '03: Proceedings of the
25th International Conference on Software Engineering, pages 74�83, Washington, DC,
USA, 2003. IEEE Computer Society.
[81] Anthony J. H. Simons. Object discovery: A process for developing applications. 1998.
[82] Anthony J. H. Simons and Carlos Alberto Fernandez y. Fernandez. Using alloy to
model-check visual design notations. In ENC '05: Proceedings of the Sixth Mexican
International Conference on Computer Science, pages 121�128, Washington, DC, USA,
2005. IEEE Computer Society.
[83] M. Szlenk. Formal semantics and reasoning about uml class diagram. In The Interna-
tional Conference on Dependability of Computer Systems, pages 51�59, 2006.
114
References
[84] B. Thalheim. Fundamentals of entity-relationship modeling. annals mathematics and
arti�cial intelligence, 7, 197-256.
[85] B. Thalheim. Fundamentals of cardinality constraints. In The 11th International Con-
ference on the Entity-Relationship Approach, pages 7�23, London, UK, 1992. Springer-
Verlag.
[86] B. Thalheim. Entity Relationship Modeling, Foundation of Database Technology.
Springer-Verlag, 2000.
[87] Walter F. Tichy. Should computer scientists experiment more? Computer, 31(5):32�40,
1998.
[88] B. Unhelkar. Veri�cation and Validation for Quality of UML 2.0 Models. Addison-
Wesley, 2005.
[89] E. J. Weyuker. Evaluating software complexity measures. IEEE Trans. Softw. Eng.,
14(9):1357�1365, 1988.
[90] Alanna Zito and Juergen Dingel. Modeling uml2 package merge with alloy. In First Alloy
Workshop, colocated with the Fourteenth ACM SIGSOFT Symposium on Foundations
of Software Engineering, 2006.
115