Alberto Ribon (CERN), Geant4 Workshop, Vancouver, September 2003: Tutorial of the Statistical Toolkit


http://www.ge.infn.it/geant4/analysis/TandA

Page 2: What is the Test & Analysis project?

Test & Analysis is a project to develop a statistical analysis system for use in Geant4 testing.

Main application areas:

– physics validation
– regression testing
– system testing

It provides tools to compare Geant4 simulation results with reference data:

– equivalent reference distributions (for instance, regression testing)
– experimental measurements
– data libraries from reference distribution sources
– functions deriving from theoretical calculations or from fits

Page 3: Team of Geant4 Test

Coordinators: Maria Grazia Pia (INFN, Genova), Andreas Pfeiffer (CERN)

Realisation of tests: Pablo Cirrone (INFN, Catania), Susanna Guatelli (Genova, CERN), Alfonso Mantero (Genova, ESA), Luciano Pandola (INFN, Gran Sasso)

Automatisation/production of tests: Sandra Parlati (INFN, Gran Sasso)

Statistical analysis: Stefania Donadio (INFN, Genova), Barbara Mascialino (INFN, Genova), Maria Grazia Pia (INFN, Genova), Andreas Pfeiffer (CERN), Alberto Ribon (CERN), Paolo Viarengo (IST, Genova)

The performed tests were established with the help of: Vladimir Ivanchenko (CERN), Michel Maire (IN2P3/LAPP)

Page 4

Electromagnetic physics: G4LowE, G4Standard, G4Penelope

Hadronic physics

– Particle CSDA range
– Particle stopping power
– Transmission coefficient
– Backscattering coefficient
– Gamma attenuation coefficient
– Gamma processes' cross section
– Particle range
– Bremsstrahlung energy spectrum
– Multiple scattering test
– Energy deposition in absorber material
– etc.

Proton Bragg peak

Study of possible tests to perform

Flexibility

Plug-in

test50

Page 5: X-ray Attenuation Coefficient - Ge

[Figure: X-ray attenuation coefficient in Ge. Curves: G4 Standard, G4 LowE, NIST-XCOM]

χ²(N-L) = 26.3, ν = 23, p = 0.29

χ²(N-S) = 27.9, ν = 23, p = 0.22

Page 6: CSDA Range – Al – G4LowE

[Figure: CSDA range in Al. Curves: Geant4-05-02, Geant4-05-00, NIST-ESTAR]

Regression testing

Page 7: Backscattering low energies - Au

Page 8: Download/setup the Statistical Toolkit

http://www.ge.infn.it/geant4/analysis/HEPstatistics

Download : StatisticsTesting-V1-00-00.tgz tarball

You also need to install the following libraries:

1) GSL (GNU Scientific Library)

2) AIDA 3.0

3) Anaphe 5.0.5

Edit the script buildAll.py to set the proper GSL path

GSL_DIR=/afs/cern.ch/sw/lhcxx/specific/redhat73/gcc3.2/PublicDomainPackages/2.0.0/

Run the script: $ ./buildAll.py

Page 9: Statistical tests available

Currently the available statistical tests are:

1) Chi2 test (for binned distributions);

2) Kolmogorov-Smirnov test (for unbinned distributions).

Planned for the near future:

3) Cramer-von Mises test (for both binned and unbinned distributions);

4) Anderson-Darling test (for both binned and unbinned distributions).

Page 10: The AIDA classes you need to know

AIDA::IHistogram1D : for 1-dimensional binned distribution

AIDA::ICloud1D : for 1-dimensional unbinned distribution

AIDA::IDataPointSet : vector of IDataPoint;

IDataPoint : is a vector of IMeasurement;

IMeasurement : (value, errorPlus, errorMinus).
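The nesting described above (a set of points, each point a vector of measurements, each measurement a value with two errors) can be pictured with plain structs. This is only a sketch: `Measurement`, `DataPoint` and `DataPointSet` below are hypothetical stand-ins written for illustration, not the real AIDA interfaces.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for IMeasurement: a value with asymmetric errors.
struct Measurement {
    double value;
    double errorPlus;
    double errorMinus;
};

// Hypothetical stand-in for IDataPoint: one Measurement per coordinate.
struct DataPoint {
    std::vector<Measurement> coordinates;
};

// Hypothetical stand-in for IDataPointSet: points sharing one dimension.
class DataPointSet {
public:
    explicit DataPointSet(std::size_t dimension) : dimension_(dimension) {}

    // Append a point whose coordinates are all zero and return it.
    // The reference is only valid until the next call to addPoint().
    DataPoint& addPoint() {
        DataPoint p;
        p.coordinates.assign(dimension_, Measurement{0.0, 0.0, 0.0});
        points_.push_back(p);
        return points_.back();
    }

    std::size_t size() const { return points_.size(); }
    std::size_t dimension() const { return dimension_; }

    const DataPoint& point(std::size_t i) const {
        assert(i < points_.size());
        return points_[i];
    }

private:
    std::size_t dimension_;
    std::vector<DataPoint> points_;
};
```

A 2-dimensional set then stores, for each point, an x measurement in `coordinates[0]` and a y measurement (with its errors) in `coordinates[1]`.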

Page 11: The StatisticsTesting classes you need to know

namespace StatisticsTesting {

  // Algorithm is, for instance, Chi2ComparisonAlgorithm
  // or KolmogorovSmirnovComparisonAlgorithm.
  template < class Algorithm >
  class StatisticsComparator {
  public:
    ComparisonResult compare( const AIDA::IDataPointSet & dps1, const AIDA::IDataPointSet & dps2 );
    ComparisonResult compare( const AIDA::IHistogram1D & histo1, const AIDA::IHistogram1D & histo2 );
    ComparisonResult compare( const AIDA::ICloud1D & cloud1, const AIDA::ICloud1D & cloud2 );
    ...
  };

  class ComparisonResult {
  public:
    double distance();  // value of the test statistic
    double quality();   // p-value of the test
    double ndf();       // number of degrees of freedom
    ...
  };

}
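The comparator takes the statistical test as a template parameter. A minimal sketch of that pattern is given here, assuming the algorithm is a class with a static `distance` function. All names (`SketchComparator`, `SketchResult`, `MaxDifferenceAlgorithm`, `SumOfSquaresAlgorithm`) are invented for this illustration and say nothing about how the toolkit is implemented internally.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical result type: only the distance is modelled here.
struct SketchResult {
    double distance;
};

// Toy algorithm 1: largest absolute element-by-element difference.
struct MaxDifferenceAlgorithm {
    static double distance(const std::vector<double>& a, const std::vector<double>& b) {
        assert(a.size() == b.size());
        double d = 0.0;
        for (std::size_t i = 0; i < a.size(); ++i) {
            d = std::max(d, std::fabs(a[i] - b[i]));
        }
        return d;
    }
};

// Toy algorithm 2: sum of squared element-by-element differences.
struct SumOfSquaresAlgorithm {
    static double distance(const std::vector<double>& a, const std::vector<double>& b) {
        assert(a.size() == b.size());
        double d = 0.0;
        for (std::size_t i = 0; i < a.size(); ++i) {
            d += (a[i] - b[i]) * (a[i] - b[i]);
        }
        return d;
    }
};

// The comparator delegates the statistics to the Algorithm chosen at compile time.
template <class Algorithm>
class SketchComparator {
public:
    SketchResult compare(const std::vector<double>& a, const std::vector<double>& b) const {
        return SketchResult{Algorithm::distance(a, b)};
    }
};
```

Switching test then means changing only the template argument, for example `SketchComparator<SumOfSquaresAlgorithm>` instead of `SketchComparator<MaxDifferenceAlgorithm>`; the calling code stays the same.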

Page 12: Chi2 test between histograms

#include <iostream>
#include <memory>
#include "AIDA/AIDA.h"
#include "StatisticsTesting/StatisticsComparator.h"
#include "Chi2ComparisonAlgorithm.h"
#include "ComparisonResult.h"

using namespace StatisticsTesting;

// Create the AIDA factories.
// (std::auto_ptr was removed in C++17; use std::unique_ptr with a modern compiler.)
std::auto_ptr<AIDA::IAnalysisFactory> af( AIDA_createAnalysisFactory() );
std::auto_ptr<AIDA::ITreeFactory> tf( af->createTreeFactory() );
std::auto_ptr<AIDA::ITree> tree( tf->create() );
std::auto_ptr<AIDA::IHistogramFactory> hf( af->createHistogramFactory( *tree ) );

// Book two histograms with identical binning: 100 bins from 0.0 to 50.0.
AIDA::IHistogram1D& hA = *( hf->createHistogram1D( "A", 100, 0.0, 50.0 ) );
AIDA::IHistogram1D& hB = *( hf->createHistogram1D( "B", 100, 0.0, 50.0 ) );

hA.fill( 15.7 );
...
hB.fill( 23.4 );
...

// Run the Chi2 test and print the result.
StatisticsComparator< Chi2ComparisonAlgorithm > comparator;
ComparisonResult result = comparator.compare( hA, hB );

std::cout << " distance=" << result.distance()
          << " ndf=" << result.ndf()
          << " p-value=" << result.quality() << std::endl;
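To make the "distance" and "ndf" printed above more concrete, here is a self-contained sketch of one common chi-squared statistic for two histograms with identical binning. It is an assumption-laden illustration, not the toolkit's code: the per-bin term (a - b)² / (a + b) presumes comparable total numbers of entries, the names `Chi2Sketch` and `chi2BetweenHistograms` are invented, and the ndf convention (one per contributing bin) may differ from the one the toolkit uses.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical result holder for this sketch.
struct Chi2Sketch {
    double distance;  // chi-squared statistic
    int ndf;          // number of bins that contributed to the sum
};

// Chi-squared between two binned distributions given as bin contents.
// Assumes identical binning and comparable total numbers of entries.
Chi2Sketch chi2BetweenHistograms(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());
    Chi2Sketch r{0.0, 0};
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double sum = a[i] + b[i];
        if (sum <= 0.0) continue;  // skip bins that are empty in both histograms
        const double diff = a[i] - b[i];
        r.distance += diff * diff / sum;
        ++r.ndf;
    }
    return r;
}
```

The p-value would follow from the chi-squared distribution with `ndf` degrees of freedom; that step is left out here because it needs a special-function library such as GSL.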

Page 13: Kolmogorov-Smirnov test between clouds

#include <iostream>
#include <memory>
#include "AIDA/AIDA.h"
#include "StatisticsTesting/StatisticsComparator.h"
#include "KolmogorovSmirnovComparisonAlgorithm.h"
#include "ComparisonResult.h"

using namespace StatisticsTesting;

// Create the AIDA factories.
// (std::auto_ptr was removed in C++17; use std::unique_ptr with a modern compiler.)
std::auto_ptr<AIDA::IAnalysisFactory> af( AIDA_createAnalysisFactory() );
std::auto_ptr<AIDA::ITreeFactory> tf( af->createTreeFactory() );
std::auto_ptr<AIDA::ITree> tree( tf->create() );
std::auto_ptr<AIDA::IHistogramFactory> hf( af->createHistogramFactory( *tree ) );

// Book two clouds (unbinned distributions).
AIDA::ICloud1D& cloudA = *( hf->createCloud1D( "A" ) );
AIDA::ICloud1D& cloudB = *( hf->createCloud1D( "B" ) );

cloudA.fill( 15.7 );
...
cloudB.fill( 23.4 );
...

// Run the Kolmogorov-Smirnov test and print the result.
StatisticsComparator< KolmogorovSmirnovComparisonAlgorithm > comparator;
ComparisonResult result = comparator.compare( cloudA, cloudB );

std::cout << " K-S distance=" << result.distance()
          << " p-value=" << result.quality() << std::endl;
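For orientation, the K-S distance between two unbinned samples is the largest absolute difference between their empirical cumulative distribution functions. The function below is a standalone sketch of that textbook definition; `ksDistance` is a name invented here, and the toolkit's own implementation may differ in detail.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Two-sample Kolmogorov-Smirnov distance: the largest absolute difference
// between the empirical cumulative distribution functions of the samples.
// The samples are taken by value because they are sorted internally.
double ksDistance(std::vector<double> a, std::vector<double> b) {
    assert(!a.empty() && !b.empty());
    std::sort(a.begin(), a.end());
    std::sort(b.begin(), b.end());
    const double na = static_cast<double>(a.size());
    const double nb = static_cast<double>(b.size());
    std::size_t i = 0;
    std::size_t j = 0;
    double d = 0.0;
    while (i < a.size() && j < b.size()) {
        const double x = std::min(a[i], b[j]);
        // Advance both samples past every entry equal to x (handles ties).
        while (i < a.size() && a[i] <= x) ++i;
        while (j < b.size() && b[j] <= x) ++j;
        d = std::max(d, std::fabs(i / na - j / nb));
    }
    // Once one sample is exhausted its ECDF is 1 and the difference can only shrink.
    return d;
}
```

Two samples that do not overlap at all give a distance of 1, and identical samples give 0.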

Page 14: Example of an XML data file

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE aida SYSTEM "http://aida.freehep.org/schemas/3.0/aida.dtd">
<aida version="3.0">
  <implementation package="Anaphe" version="5.0.0"/>
  <dataPointSet dimension="2" name="attenuation coefficient" path="/" title="attenuation coefficient in Ge">
    <annotation>
      <item key="Title" value="attenuation coefficient in Ge"/>
      <item key="Name" value="attenuation coefficient"/>
      <item key="Size" value="24"/>
    </annotation>
    <dataPoint>
      <measurement value="1.000e-03"/>
      <measurement errorMinus="9.465e+01" errorPlus="9.465e+01" value="1.893e+03"/>
    </dataPoint>
    <dataPoint>
      <measurement value="1.500e-03"/>
      <measurement errorMinus="1.3555e+02" errorPlus="2.7375e+02" value="5.475e+03"/>
    </dataPoint>
    …
    <dataPoint>
      <measurement value="1.500e+01"/>
      <measurement errorMinus="1.670e-03" errorPlus="1.670e-03" value="3.340e-02"/>
    </dataPoint>
  </dataPointSet>
</aida>

Page 15: How to write an XML data file

// Create the factories.
aFact = AIDA_createAnalysisFactory();
treeFact = aFact -> createTreeFactory();

// Open test50.xml as a new, writable XML store (readOnly = false, createNew = true).
theTree = treeFact -> create( "test50.xml", "xml", false, true, "uncompress" );
dataPointFactory = aFact -> createDataPointSetFactory( *theTree );

// Book a 2-dimensional data point set.
particleTransmissionDataPoint = dataPointFactory -> create( "Transmission test", 2 );

// Append a point, then set its x value and its y value with errors.
particleTransmissionDataPoint -> addPoint();
AIDA::IDataPoint* point = particleTransmissionDataPoint -> point( PointNumber );
AIDA::IMeasurement* coordinateX = point -> coordinate( 0 );
coordinateX -> setValue( primaryParticleEnergy );
AIDA::IMeasurement* coordinateY = point -> coordinate( 1 );
coordinateY -> setValue( TransFraction );
coordinateY -> setErrorPlus( TransError );
coordinateY -> setErrorMinus( TransError );

// Write the tree to the file and close it.
theTree -> commit();
theTree -> close();

Page 16: How to compare XML dataPointSets

#include <iostream>
#include <memory>
#include "AIDA/AIDA.h"
#include "StatisticsTesting/StatisticsComparator.h"
#include "Chi2ComparisonAlgorithm.h"
#include "ComparisonResult.h"

using namespace StatisticsTesting;

// Create the AIDA factories.
// (std::auto_ptr was removed in C++17; use std::unique_ptr with a modern compiler.)
std::auto_ptr<AIDA::IAnalysisFactory> af( AIDA_createAnalysisFactory() );
std::auto_ptr<AIDA::ITreeFactory> tf( af->createTreeFactory() );
std::auto_ptr<AIDA::ITree> tree( tf->create() );
std::auto_ptr<AIDA::IDataPointSetFactory> dpsf( af->createDataPointSetFactory( *tree ) );

// Open the two existing XML files read-only (readOnly = true, createNew = false).
std::auto_ptr<AIDA::ITree> treeXML1( tf->create( "gamma_lowE_Ge.xml", "xml", true, false ) );
std::auto_ptr<AIDA::ITree> treeXML2( tf->create( "NIST_attenuationGamma_Ge.xml", "xml", true, false ) );

// Look up the data point sets by name; dynamic_cast yields a null pointer if the object is not an IDataPointSet.
AIDA::IDataPointSet & dps1 = *( dynamic_cast<AIDA::IDataPointSet*>( treeXML1->find( "Gamma attenuation coefficient test" ) ) );
AIDA::IDataPointSet & dps2 = *( dynamic_cast<AIDA::IDataPointSet*>( treeXML2->find( "Gamma attenuation coefficient test" ) ) );

std::cout << " title=" << dps1.title()
          << " size=" << dps1.size()
          << " dimension=" << dps1.dimension() << std::endl;

// Run the Chi2 test and print the result.
StatisticsComparator< Chi2ComparisonAlgorithm > comparator;
ComparisonResult result = comparator.compare( dps1, dps2 );

std::cout << " distance=" << result.distance()
          << " ndf=" << result.ndf()
          << " p-value=" << result.quality() << std::endl;
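When the inputs are data points with errors rather than bin contents, a natural chi-squared weights each difference by the combined variance of the two points. The sketch below assumes symmetric errors on y and that both sets are sampled at the same x values; `Point`, `PointSetChi2` and `chi2BetweenPointSets` are hypothetical names, and how the toolkit treats asymmetric errors is not stated in these slides.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 2-dimensional point with a symmetric error on y.
struct Point {
    double x;
    double y;
    double yError;
};

// Hypothetical result holder for this sketch.
struct PointSetChi2 {
    double distance;  // chi-squared statistic
    int ndf;          // number of points that contributed to the sum
};

// Chi-squared between two point sets, point by point:
// sum of (y1 - y2)^2 / (sigma1^2 + sigma2^2).
// Assumes p[i] and q[i] refer to the same x value.
PointSetChi2 chi2BetweenPointSets(const std::vector<Point>& p, const std::vector<Point>& q) {
    assert(p.size() == q.size());
    PointSetChi2 r{0.0, 0};
    for (std::size_t i = 0; i < p.size(); ++i) {
        const double variance = p[i].yError * p[i].yError + q[i].yError * q[i].yError;
        if (variance <= 0.0) continue;  // no error information: the point cannot be weighted
        const double diff = p[i].y - q[i].y;
        r.distance += diff * diff / variance;
        ++r.ndf;
    }
    return r;
}
```

Points for which neither set carries an error are skipped here rather than given an arbitrary weight; that is a design choice of this sketch only.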

Page 17: Conclusions

The Statistical Toolkit already provides some important functionality;

it is already used in test50 for Geant4 physics tests and regression testing;

it is simple to install and to use; we are working on the documentation;

other statistical tests are under development;

various long-term extensions are foreseen …