ACAT 2002
June, 2002 Lassi A. Tuura, Northeastern University http://iguana.cern.ch

Ignominy
Tool for Analysing Software Dependencies and for Reducing Complexity in Large Software Systems
On behalf of CMS Collaboration
Lassi A. Tuura
Northeastern University, Boston

Motivation
IGUANA is largely an integrator for CMS: we need to have a good grasp of the external software before its inclusion into our system
  By and large we are not seeking to select one product…
  but are trying to merge the strengths of several packages into a very good physics analysis environment…
  and are seeking to provide feedback to component authors
We are interested in, among others:
  How much of the external package we would use
  Its impact on our physical software structure
  How well it fits in with the philosophy of CMS software and other imports: in design and architecture, usage patterns, GUI, …
  What other software it depends on
  Commitment required, possibility of varying how much we use

Examples
See http://iguana.cern.ch/3_1_0/dependencies.html

Ignominy Model
ignominy: dishonour, disgrace, shame; infamy; the condition of being in disgrace, etc. (Oxford English Dictionary)
Ignominy: a suite of perl and shell scripts plus a number of configuration files (IGUANA)
Examines and reports on direct and transitive source and binary dependencies
Creates reports of the collected results
  As a set of web pages
  Numerically
  Graphically
  As tables
[Diagram labels: Source Code, Build Products, User-defined logical dependencies (+), Dependency Database, Metrics, Graphs, Tables]

Dependency Analysis
Ignominy scans…
  Make dependency data produced by the compilers (*.d files)
  Source code for #includes (resolved against the ones actually seen)
  Shared library dependencies (“ldd” output)
  Defined and required symbols (“nm” output)
And maps…
  Source code and binaries into packages
  #include dependencies into package dependencies
  Unresolved/defined symbols into package dependencies
And warns…
  about problems and ambiguities (e.g. multiply defined symbols or dependent shared libraries not found)
Produces a simple text file database for the different dependencies: source only, binaries only, combined, forward and reverse, by package, by domain, …
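
The mapping from #include lines to package dependencies can be sketched roughly as follows. This is a minimal illustration in Python, not Ignominy's own Perl code: the rule that the first two path components name the package, the function names and the example paths are all assumptions made for the example.

```python
import re
from collections import defaultdict

# Matches lines such as: #include "Pkg/interface/Header.h" or #include <vector>
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*[<"]([^>"]+)[>"]')

def package_of(path):
    """Hypothetical project rule: the first two path components name the package."""
    return "/".join(path.split("/")[:2])

def package_dependencies(sources, known_headers):
    """Turn #include lines into package-level edges.

    sources:       {source file path: file contents}
    known_headers: header paths actually seen in the release
    Returns {(from_package, to_package): set of headers causing the edge}.
    """
    edges = defaultdict(set)
    for path, text in sources.items():
        for line in text.splitlines():
            match = INCLUDE_RE.match(line)
            if not match:
                continue
            header = match.group(1)
            if header not in known_headers:
                continue  # system header or something outside the release
            src, dst = package_of(path), package_of(header)
            if src != dst:
                edges[(src, dst)].add(header)
    return dict(edges)

# Hypothetical two-package release.
sources = {
    "Viewer/App/src/main.cc":
        '#include "Viewer/Twig/interface/Model.h"\n'
        '#include "Viewer/App/interface/App.h"\n'
        '#include <vector>\n',
}
headers = {"Viewer/Twig/interface/Model.h", "Viewer/App/interface/App.h"}
deps = package_dependencies(sources, headers)
```

Keeping the responsible headers next to each edge is what makes a report of the form "package A uses package B via header H" possible.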

Single Package Dependencies
Cmscan/IgCmscan
Testing Level: 5
Outgoing edges: 6
- from includes: 6 (145 files)
- from symbols: 4 (636 symbols)
Incoming edges: 1
- from includes: 1 (1 file)
- from symbols: 1 (1 symbol)
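
The slide does not define "Testing Level". Assuming it follows the usual levelization idea (a package with no dependencies inside the release is level 1, and every other package sits one level above its highest dependency), a sketch in Python could look like this. The package names are made up, and the function only handles acyclic graphs.

```python
def package_levels(deps):
    """deps: {package: set of packages it depends on directly}.

    Returns {package: level}. Assumes level 1 for packages without
    dependencies inside the release; raises ValueError on a cycle.
    """
    levels = {}
    visiting = set()

    def level(pkg):
        if pkg in levels:
            return levels[pkg]
        if pkg in visiting:
            raise ValueError(f"cycle through {pkg}: levels are undefined")
        visiting.add(pkg)
        lvl = 1 + max((level(d) for d in deps.get(pkg, ())), default=0)
        visiting.discard(pkg)
        levels[pkg] = lvl
        return lvl

    for pkg in deps:
        level(pkg)
    return levels

# Hypothetical three-package release.
example = {"App": {"Gui", "Core"}, "Gui": {"Core"}, "Core": set()}
levels = package_levels(example)  # Core: 1, Gui: 2, App: 3
```

Under this assumption the level is also the order in which packages can be tested bottom-up, which is one reading of the test plan idea.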

Domain Test Plan

Package Impact Diagram
“Used-by” dependencies
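
A "used-by" view is the forward dependency graph read backwards: everything that directly or transitively depends on a package is affected when that package changes. A minimal Python sketch, with made-up package names and a hypothetical helper name:

```python
from collections import defaultdict, deque

def impacted_by(deps, changed):
    """Packages that directly or transitively use `changed`.

    deps: {package: set of packages it depends on directly}
    """
    # Reverse the edges: for each package, who uses it?
    users = defaultdict(set)
    for pkg, needed in deps.items():
        for dep in needed:
            users[dep].add(pkg)

    seen, queue = set(), deque([changed])
    while queue:
        current = queue.popleft()
        for user in users[current]:
            if user not in seen:
                seen.add(user)
                queue.append(user)
    seen.discard(changed)  # only relevant if `changed` sits in a cycle
    return seen

# Hypothetical release.
example = {"App": {"Gui"}, "Gui": {"Core"}, "Tool": {"Core"}, "Core": set()}
impact = impacted_by(example, "Core")
```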

An Extra Dependency
Bad dependency in prototype code; it was traced to bad class placement
  1 IgSoReaderAppDriver IgQtTwigBrowser via IgQtTwigModel.h
  1 IgSoReaderAppDriver IgQtTwigBrowser via IgQtTwigRep.h

Static vs. Logical
Logical dependencies from packages used through “Interfaces”

Discovering Forms of Modularity
A fairly good tool for discovering “philosophical structure”
IGUANA and Geant4 mostly use direct abstract interfaces
  – The interfaces normally generate “correct” functional dependencies: interface definitions are in packages that obviously imply the function
“Plug in one implementation of this interface”
  – Some use in Lizard/AIDA and ROOT
All interfaces bundled into “interface” (or framework) packages
  – Used by Lizard/AIDA and ROOT
Explicit dynamic loading to solve modularity issues
  – Used extensively by ROOT
Fall back on scripts or commands evaluated at run-time
  – Some use in Geant4
  – Used quite a bit in ROOT

Analysis of Anaphe
Distribution of tools and utilities for LHC era physics
  Combination of commercial, free and HEP software
  Claims to be a toolkit
Appears to live up to its toolkit claims
  Good work on modularity
  Clean design is evident in many places
  Dependency diagrams often split naturally into functional units

Analysis of ATLAS
Torture-test exercise for the tool
  Large release size (~50% F77, ~50% mainly C++ but also C, Java)
  Near the limit of Ignominy’s ability to discover software structure
  Pictures below illustrate analysis difficulties
Visible (and known) problems
  Many cleanly designed packages shadowed by a cycle with very unpleasant effects on the overall structure
  A number of places show poor packaging and/or lack of abstract interfaces
[Figures: Known by build system; Misconfigured analysis (1.3.2); Improved analysis (1.3.7)]

Analysis of CMS/ORCA
Large C++ project
  Deliberately fast development shows in places
  Good design in key parts has helped
Recognised problems
  Especially with the length of the release sequence
  Clean-up/restructuring necessary soon
    – To some extent starting already
  Large metric fluctuations from version to version
[Figure: ORCA Visualisation needs most of the rest]

Analysis of CMS/COBRA, IGUANA
COBRA
  CMS reconstruction, analysis and simulation framework
  Recently successfully split off from ORCA
  Quite a lot of small packages
    – Has helped with modularity
    – Some issues with partitioning: some small cycles, certain package groups appear quite frequently
IGUANA
  Generic data analysis environment with CMS focus
  Many fairly small packages with targeted purpose (similar to Anaphe)
  Project focus as an integrator and glue provider is fairly evident
  We too have some rats’ nests to clean up, but at least they are small…
  Has had the advantage of considerable monitoring!

Analysis of Geant4
Fairly large C++ project
  Very fine-grained (and multi-level) package structuring
  Structure seems quite clean from the preliminary analysis
Fine package subdivision helps in many ways but makes analysis and code understanding more complicated
[Figure annotations: One subsystem seems strongly coupled and needs attention; Need to study the use of the internal command system]

Analysis of ROOT
ROOT developers have done a formidable job of breaking binary (shared library) dependencies, but…
  It makes dubious use of its internal scripting facility
  For example: by static analysis, nothing seems to use the postscript package directly (no incoming dependencies), but there is this code:

    void TPad::Print (const char *filename, Option_t *option) {
      […]
      TVirtualPS *psave = gVirtualPS;
      if (gROOT->LoadClass("TPostScript","Postscript")) return;
      gROOT->ProcessLineFast("new TPostScript()");
      gVirtualPS->Open(psname,pstype);
      gVirtualPS->SetBit(kPrintingPS);
      […]
    }

  Taking these and global objects into account makes the dependency diagrams very different, and casts doubt on the usefulness of binary-only dependency diagrams for ROOT
Sign of fast growth? Need a “next evolutionary step”?
  So “coherent” that replacing parts could get painful…

Analysis of ROOT…
[Figures: Binary only; Binary + Source + Logical = Real]
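
The "Binary + Source + Logical = Real" caption can be read as a union of edge sets: dependencies found in binaries, dependencies found in source, and logical dependencies declared by hand. The Python sketch below uses illustrative package names and edges only (they are not output from a real Ignominy run) to show how a package with no incoming binary edges gains one once a logical edge is declared.

```python
def combine(*edge_sets):
    """Union several collections of (user, used) package edges."""
    real = set()
    for edges in edge_sets:
        real |= set(edges)
    return real

def incoming(edges, package):
    """Packages with a direct edge into `package`."""
    return {user for user, used in edges if used == package}

# Illustrative edges only; package names are assumptions for the example.
binary = {("gpad", "graf"), ("gpad", "base")}
source = {("gpad", "graf"), ("gpad", "base"), ("graf", "base")}
# Declared by hand, because the class is loaded by name at run-time.
logical = {("gpad", "postscript")}

real = combine(binary, source, logical)
before = incoming(binary, "postscript")  # empty: invisible to binary analysis
after = incoming(real, "postscript")     # now shows one user
```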

Package Metrics

Project      Release   Packages  Avg. direct  Cycles      # of    ACD*   CCD*   NCCD*  Size
                                 deps         (packages)  levels
Anaphe       3.6.1     31        2.6          --          8       5.4    167    1.3    630/170k
ATLAS        1.3.2     230       6.3          2 (92)      96      70     16211  10     1350k
ATLAS        1.3.7     236       7.0          2 (92)      97      77     18263  11     1350k
CMS/ORCA     4.6.0     199       7.4          7 (22)      35      24     4815   3.6    420k
CMS/ORCA     6.1.0     385       10.1         4 (9)       29      37     14286  4.9    580k
CMS/COBRA    5.2.0     87        6.7          4 (10)      19      15     1312   2.7    180k
CMS/COBRA    6.1.0     99        7.0          4 (8)       20      17     1646   2.9    200k
CMS/IGUANA   2.4.2     35        3.9          --          6       5.0    174    1.2    150/38k
CMS/IGUANA   3.1.0     45        3.3          1 (2)       8       6.1    275    1.3    150/60k
Geant4       4.3.2     108       7.0          3 (12)      21      16     1765   2.8    680k
ROOT         2.25/05   30        6.4          1 (19)      22      19     580    4.7    660k

*) John Lakos, Large-Scale C++ Software Design
Avg. direct deps = average number of direct dependencies; Cycles (packages) = number of cycles (packages involved)
Size = total amount of source code (roughly; not normalised across projects!)
ACD = average component dependency (~ libraries linked in)
CCD = sum of single-package component dependencies over whole release: test cost
NCCD = measure of CCD compared to a balanced binary tree
  – < 1.0: structure is flatter than a binary tree (= independent packages)
  – > 1.0: structure is more strongly coupled (vertical or cyclic)
  – Aim: minimise NCCD for given software/functionality (good toolkit: ~ 1.0)
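
As a rough illustration of how the three metrics relate, here is a Python sketch. It assumes the definitions from Lakos's book: a package's component dependency counts itself plus everything it reaches, CCD sums that over all packages, ACD = CCD / N, and NCCD divides CCD by the CCD of a balanced binary tree of N packages, (N + 1) log2(N + 1) - N. The function names and the seven-package example are made up; this is not Ignominy's own implementation.

```python
import math

def component_dependency(deps, pkg):
    """Packages needed to test `pkg`: itself plus everything reachable from it."""
    seen, stack = {pkg}, [pkg]
    while stack:
        for dep in deps.get(stack.pop(), ()):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return len(seen)

def balanced_tree_ccd(n):
    """CCD of a balanced binary tree with n packages."""
    return (n + 1) * math.log2(n + 1) - n

def metrics(deps):
    """deps: {package: set of direct dependencies}; every package must be a key."""
    n = len(deps)
    ccd = sum(component_dependency(deps, pkg) for pkg in deps)
    return {"CCD": ccd, "ACD": ccd / n, "NCCD": ccd / balanced_tree_ccd(n)}

# A perfectly balanced binary tree of 7 packages: NCCD is 1.0 by construction.
tree = {
    "root": {"l", "r"},
    "l": {"ll", "lr"}, "r": {"rl", "rr"},
    "ll": set(), "lr": set(), "rl": set(), "rr": set(),
}
tree_metrics = metrics(tree)

# A chain of 3 packages is more vertical than a tree, so NCCD exceeds 1.0.
chain_metrics = metrics({"a": {"b"}, "b": {"c"}, "c": set()})
```

As a consistency check on the assumed formula, the Anaphe row (31 packages, CCD 167) gives ACD = 167 / 31 ≈ 5.4 and NCCD = 167 / 129 ≈ 1.3, matching the table.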

Metrics: NCCD vs Cycles
[Chart: NCCD (0 to 12) against Fraction of Packages in Cycles (0% to 70%). Labelled points: ATLAS, ORCA4, ORCA6, ROOT, Anaphe, IGUANA, COBRA, G4. A region is marked “Toolkits & Frameworks”.]
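
The number of cycles, the packages involved and the fraction of packages in cycles can all be derived from the package graph by grouping packages that can reach each other. The Python sketch below uses plain reachability rather than a tuned strongly-connected-components algorithm, which is an implementation choice for brevity; the example graph and function names are made up.

```python
def reachable(deps, start):
    """Everything reachable from `start` through at least one edge."""
    seen, stack = set(), list(deps.get(start, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(deps.get(current, ()))
    return seen

def cycles(deps):
    """Groups of packages that mutually depend on each other."""
    reach = {pkg: reachable(deps, pkg) for pkg in deps}
    groups, assigned = [], set()
    for pkg in deps:
        if pkg in assigned or pkg not in reach[pkg]:
            continue  # already grouped, or not on any cycle
        group = {q for q in reach[pkg] if q in reach and pkg in reach[q]}
        groups.append(group)
        assigned |= group
    return groups

def cycle_summary(deps):
    """Returns (number of cycles, packages involved, fraction of packages in cycles)."""
    groups = cycles(deps)
    involved = sum(len(group) for group in groups)
    return len(groups), involved, involved / len(deps)

# Hypothetical release: A, B and C form one cycle; D and E are outside it.
example = {"A": {"B"}, "B": {"C"}, "C": {"A"}, "D": {"A"}, "E": set()}
summary = cycle_summary(example)
```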

Metrics: NCCD vs Size
[Chart: NCCD (0 to 12) against Size (k-lines of source [files], 0 to 1600). Labelled points: ATLAS, ORCA4, ORCA6, ROOT, G4, Anaphe, IGUANA, COBRA. A region is marked “Toolkits & Frameworks”.]

Metrics: NCCD vs ACD
[Chart: NCCD (0 to 12) against Av. Component Deps (Fraction of Packages, 0 to 0.7). Labelled points: ATLAS, ORCA, ROOT, Anaphe, IGUANA, COBRA, G4. A region is marked “Toolkits & Frameworks”.]

Metrics: NCCD vs AID
[Chart: NCCD (0 to 12) against Av. Immediate Deps (Fraction of Packages, 0 to 0.25). Labelled points: ATLAS, ORCA, ROOT, G4, COBRA, Anaphe, IGUANA. A region is marked “Toolkits & Frameworks”.]

Metrics: Packages vs Size
[Chart: Packages (0 to 450) against Size (Own Only, 0 to 1600). Labelled points: ATLAS, ORCA6, ORCA4, G4, ROOT, Anaphe, IGUANA, COBRA. A region is marked “Toolkits & Frameworks”.]

Metrics: Packages vs Size
[Chart: Packages (0 to 450) against Size (All, 0 to 1600). Labelled points: ATLAS, ORCA6, ORCA4, ROOT, COBRA, G4, Anaphe, IGUANA. A region is marked “Toolkits & Frameworks”.]

Caveats
Ignominy analyses only static dependencies, not dynamic ones
  Indirect calls through pointers, virtual function calls
  State dependencies: data reads and writes, thread synchronisation, …
The analysis of external software is heuristic; exact information from the build system helps considerably
  Difficulties are posed by copied code (copy and paste or merged libraries) and defaults dependent on link-order (“dummies” that are supposed to be overridden by client)
  Most headaches so far with FORTRAN code
Ignominy must guess software structure when in doubt
  Based on project-defined heuristic search rules, usually works fine
  In face of an ambiguity Ignominy warns and assumes the worst
    – Multiply defined symbol: dependency on all definitions
    – Multiple header matches: dependency on all (but correct with compiler-generated dependency data!)
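
The "warn and assume the worst" rule for multiply defined symbols might look like the following Python sketch. The input dictionaries stand in for parsed "nm" output; their shape, the function name and the library names are assumptions made for the example, not Ignominy's actual data format.

```python
from collections import defaultdict

def symbol_dependencies(defined, required):
    """Resolve unresolved symbols to package edges, pessimistically.

    defined:  {package: set of symbols it defines}
    required: {package: set of symbols it needs but does not define}
    Returns (edges, warnings); an ambiguous symbol yields an edge to
    every package defining it, plus a warning.
    """
    providers = defaultdict(set)
    for pkg, symbols in defined.items():
        for symbol in symbols:
            providers[symbol].add(pkg)

    edges, warnings = set(), []
    for pkg, symbols in required.items():
        for symbol in sorted(symbols):
            owners = providers.get(symbol, set()) - {pkg}
            if len(owners) > 1:
                warnings.append(
                    f"{symbol} (needed by {pkg}) is defined in {sorted(owners)}")
            for owner in owners:
                edges.add((pkg, owner))  # assume the worst: depend on all
    return edges, warnings

# Hypothetical libraries: "foo" is defined twice, "printf" is external.
defined = {"LibA": {"foo"}, "LibB": {"foo", "bar"}}
required = {"App": {"foo", "bar", "printf"}}
edges, warnings = symbol_dependencies(defined, required)
```

Symbols with no definition inside the release, such as the external one in the example, simply produce no edge in this sketch.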

Status
Run for every IGUANA release as a part of release build
Canned configuration for any SCRAM-based project
  Needs project specific colouring etc. configurations
Works with many other project structures
  Tried on G4, ROOT and ATLAS
Plans
  Consolidate scripts and fold in all the documentation
  Make it somewhat easier to use and configure
  Java support with Mark Donszelmann’s jneeds
Available for free at http://iguana.cern.ch/
  See the IGUANA distributions (latest = 3.1.0 recommended)
Questions? Please mail [email protected] or [email protected]