Far Detector Data Quality
Andy Blake, Cambridge University
• The atmospheric analysis group has studied the Far Detector data in great detail.
  - Large data set (August 2003 – February 2005).
  - Time spent at the Soudan mine!
  - Performed a complete physics analysis.
• Tools developed to examine data at many levels.
  - Raw data, cosmic muons etc…
  - Readout pathologies, hardware failures etc…
  - Cuts developed to select good and bad data.
  - But not in widespread use…!
• Next iteration of atmospheric analysis underway.
  - Need detailed data quality checks for this analysis.
  - Opportunity to develop tools for wider use…!
Far Detector Data Quality
Far Detector data quality issues roughly divide in two:
• GLOBAL DATA QUALITY: DCS systems and veto shield. Physics data needs:
  - Coil on.
  - HV on.
  - No holes!
• CHANNEL-BY-CHANNEL QUALITY: bad channels.
  - Hot/cold.
  - Busy.
  - Badly calibrated.
• ? Problems that we don’t know about yet!
Far Detector Data Quality
[Diagram: inputs to the Data Quality Tools]
• Shift logbooks
• Online monitoring
• Database
• DCS info
• Raw data
• Calibration info
• Cosmic muons
• Etc.
Far Detector Data Quality
Global Data Quality:
(1) Select good physics runs.
  → Determine which runs were intended as physics data.
  → Identify causes of bad data (e.g. LI leaks etc.)
  → Produce list of good physics runs.
(2) Identify when detector was in a bad state (e.g. HV trips, coil trips, veto shield holes etc.)
  → (DCS) database + tools to access it.
Channel-by-Channel Quality:
(3) Compile record of bad detector components (e.g. hot/cold/busy chips, bad readout, bad calibration etc.)
  → (HARDWARE) database + tools to access it.
  → Information from raw data – pass through offline framework.
Far Detector Run Selection
[Diagram labels: Soudan; Database (DBU, Run Summary); Cambridge selection; Fermilab selection; data quality checks.]
Far Detector Run Selection
• Far Detector run selection procedure:
  Cambridge:
  - Entry in database.
  - Run types 2, 769.
  - ~5000 snarls.
  Fermilab:
  - Run types 2, 769, 17153.
  - 60 seconds.
  - 1 snarl.
• Number of subruns during period August 1st 2003 → July 31st 2005:

  Cambridge Only    Overlap    Fermilab Only
  830               11700      1360

• This represents quite a large discrepancy!
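The comparison of the two run lists above is, in effect, a set difference. A minimal sketch, assuming each list is available as (run, subrun) pairs. The function name and the sample pairs are hypothetical and are not taken from the actual lists.

```python
def compare_run_lists(cambridge, fermilab):
    """Split two collections of (run, subrun) pairs into three groups."""
    cam, fnal = set(cambridge), set(fermilab)
    return {
        "cambridge_only": cam - fnal,
        "overlap": cam & fnal,
        "fermilab_only": fnal - cam,
    }


# Hypothetical (run, subrun) pairs, for illustration only.
cambridge_list = [(100, 0), (100, 1), (101, 0)]
fermilab_list = [(100, 1), (101, 0), (102, 0), (102, 1)]

groups = compare_run_lists(cambridge_list, fermilab_list)
for name, members in groups.items():
    print(name, len(members), sorted(members))
```

With the counts quoted above, subruns that appear on only one list make up (830 + 1360) / (830 + 11700 + 1360) ≈ 16% of all listed subruns.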
Run Selection Issues
(1) Modified Runs
• Approximately 1500 subruns have “modified” bit set.
• Corresponds to several weeks live time (The Far Detector DAQ is typically left with the “modified” flag set for long periods of time).
• Typical “modified” run comments*:
  – Testing the E4 trigger.
  – Removed Pulser Boxes from LI config.
  – Flashing LI at 500Hz.
  – Switched HV cards.
  – “Restarting DAQ after recovery”.
  – “Standard physics data”.
• These runs are generally okay for use in physics analyses – use other data quality cuts to reject any bad data.
(N.B. In contrast, “test” runs typically correspond to changes in readout components, so it’s probably not wise to use these in physics analyses.)
Run Selection Issues
(2) The Database (DBUSUBRUNSUMMARY Table)
• ~1200 runs have no entry at all in the database!
– most are bad runs (e.g. the DAQ crashed before the run started) but a significant number do correspond to good physics data.
– the gaps occur before 2005 so no beam data is affected.
• ~100 subruns can be recovered by searching for gaps.
– Gap in middle of run: subrun 0 1 2 GAP 4 5 6 7
– Gap at end of run: subrun 0 1 2 3 4 5 6 GAP
• ~300 subruns are on the Fermilab list but not in the database.
• ~20 subruns are under 10 seconds according to the database, but are actually several hours long!
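The gap search described above can be sketched as follows. This is an illustration only, assuming subruns are numbered from 0 within a run. The function name is hypothetical, and `expected_last` is a hypothetical input: a gap at the end of a run cannot be seen from the database entries alone, so the highest subrun has to come from some other source.

```python
def find_subrun_gaps(subruns, expected_last=None):
    """Return the subrun numbers missing from one run.

    subruns: subrun numbers that have a database entry.
    expected_last: highest subrun known to exist from another source
        (hypothetical input); needed to detect a gap at the end of a run.
    """
    present = set(subruns)
    if not present and expected_last is None:
        return []
    last = max(present) if present else -1
    if expected_last is not None:
        last = max(last, expected_last)
    return [n for n in range(last + 1) if n not in present]


# Gap in middle of run: subrun 3 has no entry.
middle_gap = find_subrun_gaps([0, 1, 2, 4, 5, 6, 7])
# Gap at end of run: only visible if subrun 7 is known to exist.
end_gap = find_subrun_gaps([0, 1, 2, 3, 4, 5, 6], expected_last=7)
print(middle_gap, end_gap)
```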
Run Selection Issues
(3) Short Runs
• Cambridge List has ~600 subruns with < 5 minutes data. (Mostly from March 2004: dynode threshold scans + timing system tests. I think that we forgot to set a snarl threshold for this month!)
• Fermilab List has ~100 subruns with < 5 minutes data.
• Generally suspicious of short runs:
  – total number of snarls or timeframes is sometimes a “round” number.
  – some runs have unusually high or low snarl rates.
  – some runs have been used to test new software or components.
  – all test/modified data before May 2004 is just labelled as “normal data”.
Run Selection Summary
• Most of the differences between the Cambridge and Fermilab run lists can easily be explained – but there are still some discrepancies.
New Data Selection
New Cambridge data selection:
• Select runs using database: run types 2, 769, 17153; >5 minutes data.
• Fill in gaps in the database!
• Select “good” physics events. → Physics Analysis.
• Select “clean” cosmic muons. → Timing Calibration.
• Good data!
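The database step of this selection (run types 2, 769, 17153 and more than 5 minutes of data) could be expressed as a simple filter. A minimal sketch: the record layout and field names are assumptions made for illustration and do not reflect the real database schema.

```python
GOOD_RUN_TYPES = {2, 769, 17153}   # run types quoted in the selection
MIN_SECONDS = 5 * 60               # require more than 5 minutes of data


def is_candidate_subrun(run_type, duration_seconds):
    """True if a subrun passes the run-type and minimum-length requirements."""
    return run_type in GOOD_RUN_TYPES and duration_seconds > MIN_SECONDS


# Hypothetical subrun records, for illustration only.
subruns = [
    {"run": 1, "subrun": 0, "run_type": 2, "seconds": 3600},
    {"run": 1, "subrun": 1, "run_type": 2, "seconds": 120},     # too short
    {"run": 2, "subrun": 0, "run_type": 5, "seconds": 3600},    # wrong type
    {"run": 3, "subrun": 0, "run_type": 17153, "seconds": 900},
]

selected = [
    (s["run"], s["subrun"])
    for s in subruns
    if is_candidate_subrun(s["run_type"], s["seconds"])
]
print(selected)
```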
Physics Data
Select “Good” Physics Events:
• Data should contain events with:
  – correct trigger bit(s) set.
  – 10 < digits < 1000.
  – LI channels < 500.
  [Select physics events.]
  – dead chips < 20.
  – dead crates = 0.
  [Remove high voltage trips + data with incomplete ROP mask.]
  – event rate < 75 Hz.
  [Remove anomalously high rates.]
• These cuts have the following effects:
  – remove runs where HV is down or readout is incomplete.
  – remove “normal data” which isn’t actually physics data!
  – remove runs with anomalous events or rates.
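The cuts listed above can be collected into one check that also reports which cut failed. A sketch under stated assumptions: the class, field and function names are hypothetical, and the slides do not say whether each quantity is evaluated per event or per block of data, so a single summary record is used here.

```python
from dataclasses import dataclass


@dataclass
class EventSummary:
    """Hypothetical summary record; field names are illustrative."""
    trigger_ok: bool        # correct trigger bit(s) set
    digits: int             # number of digits
    li_channels: int        # number of LI channels
    dead_chips: int
    dead_crates: int
    event_rate_hz: float


def failed_cuts(ev):
    """Return the names of the quoted cuts that this record fails."""
    checks = {
        "trigger": ev.trigger_ok,
        "digits": 10 < ev.digits < 1000,
        "li_channels": ev.li_channels < 500,
        "dead_chips": ev.dead_chips < 20,
        "dead_crates": ev.dead_crates == 0,
        "event_rate": ev.event_rate_hz < 75.0,
    }
    return [name for name, passed in checks.items() if not passed]


good = EventSummary(True, 150, 0, 3, 0, 4.0)
hv_trip = EventSummary(True, 150, 0, 400, 2, 4.0)
print(failed_cuts(good))
print(failed_cuts(hv_trip))
```

Returning the list of failed cuts, rather than a single pass/fail flag, makes it easier to see which cut removes a given run.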
Physics Data
[Plot: August 1st 2003 → January 31st 2005. Label: singles < 50 Hz. Cut: require less than 20 dead chips.]
Physics Data
[Plot: August 1st 2003 → January 31st 2005. Cut: require all 32 half-crates to be working!]
Physics Data
[Plot: August 1st 2003 → January 31st 2005. Label: singles > 2500 Hz. Annotation: 300 Hz Light Injection! Is this bad?]
Physics Data
[Plot: August 1st 2003 → January 31st 2005. Raw snarls.]
Physics Data
[Plot: August 1st 2003 → January 31st 2005. Filtered snarls.]
Physics Data
[Plot: August 1st 2003 → January 31st 2005. Filtered snarls. Cut: require event rate less than 75 Hz; remove these runs!]
Cosmic Muons
Select “Clean” Cosmic Muons:
• Data should contain events with:
  – Hits in >10 planes.
  – Hits in >3 planes in each view.
  – Satisfies straight line fit (rms < 1 cm).
  [Select cosmic muons. Typically these events occur every ~3 seconds.]
  – Cosmic muon rate < 1 Hz.
  [Remove anomalously high rates.]
• These cuts have the following effects:
  – Select clean events for timing calibration.
  – Remove runs with anomalous events or rates.
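The per-event muon cuts above can be illustrated with an unweighted least-squares line fit in each view. This is a sketch, not the fit used in the analysis: the slides do not say how the rms is defined, so the rms residual per view is an assumption, as are the function names and the representation of hits as one (z, transverse position) pair in cm per plane.

```python
import math


def line_fit_rms(hits):
    """RMS residual (cm) of a least-squares line through (z, position) hits."""
    n = len(hits)
    z_mean = sum(z for z, _ in hits) / n
    x_mean = sum(x for _, x in hits) / n
    szz = sum((z - z_mean) ** 2 for z, _ in hits)
    if szz == 0.0:
        raise ValueError("need hits at two or more distinct z positions")
    slope = sum((z - z_mean) * (x - x_mean) for z, x in hits) / szz
    intercept = x_mean - slope * z_mean
    residuals = [x - (intercept + slope * z) for z, x in hits]
    return math.sqrt(sum(r * r for r in residuals) / n)


def is_clean_muon(u_hits, v_hits, max_rms_cm=1.0):
    """Apply the plane-count and straight-line cuts to one event."""
    planes_u = len({z for z, _ in u_hits})
    planes_v = len({z for z, _ in v_hits})
    if planes_u + planes_v <= 10 or planes_u <= 3 or planes_v <= 3:
        return False
    return (line_fit_rms(u_hits) < max_rms_cm
            and line_fit_rms(v_hits) < max_rms_cm)


# Hypothetical hits lying exactly on straight lines, six planes per view.
u_track = [(12.0 * i, 6.0 * i) for i in range(6)]
v_track = [(12.0 * i + 6.0, 100.0 - 3.0 * i) for i in range(6)]

# Same event with one V hit displaced by 5 cm.
v_kinked = list(v_track)
v_kinked[2] = (v_kinked[2][0], v_kinked[2][1] + 5.0)

print(is_clean_muon(u_track, v_track))
print(is_clean_muon(u_track, v_kinked))
```

One clean muon every ~3 seconds corresponds to roughly 0.3 Hz, which is why a rate above 1 Hz is treated as anomalous.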
Cosmic Muons
[Plot: August 1st 2003 → January 31st 2005.]
Cosmic Muons
[Plot: August 1st 2003 → January 31st 2005. Cut: require muon rate less than 1 Hz; remove these runs!]
Feedback from Analysis
Please tell me if you find any bad runs!
Aside: VARC Errors
• Far Detector readout errors recorded in “VarcErrorInTfBlocks”
• Two types of readout error are reported:
– “ETC” errors (e.g. error in ETC readout, overflow of FIFO).
– “Sparsifier” errors (e.g. corruption in FIFO, overflow of VME buffer).
• Data corruption or buffer overflows could corrupt or truncate physics events – need to be careful!
VARC Errors
[Diagram: VA Readout Card. Labels: VME; VMM (×6); ETC (×6); Sparsifier; VME buffers; Readout Processor; signal; control; digital signal. Captions: data stored in FIFOs; data stored in VME buffers.]
VARC Errors
[Diagram stages: FIFOs; Sparsifier; buffer; VME transfer.]
SPARSIFIER: ~250 kHz (VA chips)
  Singles: ~15 kHz (detector), ~10 kHz (shield)
  Light Injection: ~30 x Rate, LI@300Hz ~10 kHz
  Total: ~35 kHz
VME TRANSFER: ~320 kHz (VA channels)
  Singles: ~15 kHz (detector), ~10 kHz (shield)
  Light Injection: ~300 x Rate, LI@300Hz ~100 kHz
  Total: ~125 kHz
Far Detector data rates should be well within readout capabilities.
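The rate budget above is simple arithmetic and can be checked directly. A minimal sketch, assuming the ~250 kHz and ~320 kHz figures are the capabilities of the two stages (my reading of the slide). All names are hypothetical, and the numbers are the rounded values quoted above.

```python
LI_PULSE_RATE_HZ = 300          # light injection flashing at 300 Hz
SINGLES_DETECTOR_KHZ = 15
SINGLES_SHIELD_KHZ = 10

# Quoted figures per stage; "capability_khz" is an interpretation (assumption).
STAGES = {
    "sparsifier": {"capability_khz": 250, "li_multiplier": 30, "li_quoted_khz": 10},
    "vme_transfer": {"capability_khz": 320, "li_multiplier": 300, "li_quoted_khz": 100},
}


def rate_budget(stage):
    """Recompute the LI rate and the total load for one readout stage."""
    li_from_multiplier_khz = stage["li_multiplier"] * LI_PULSE_RATE_HZ / 1000
    total_khz = SINGLES_DETECTOR_KHZ + SINGLES_SHIELD_KHZ + stage["li_quoted_khz"]
    return {
        "li_from_multiplier_khz": li_from_multiplier_khz,
        "total_khz": total_khz,
        "fraction_of_capability": total_khz / stage["capability_khz"],
    }


budget = {name: rate_budget(stage) for name, stage in STAGES.items()}
for name, result in budget.items():
    print(name, result)
```

The multipliers give 9 kHz and 90 kHz of light injection, which the slide rounds to ~10 kHz and ~100 kHz. The totals are then about 14% and 39% of the quoted capabilities.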
VARC Errors
[Plot: August 1st 2003 → April 30th 2004. Labels: 100% DEAD TIME; 25% DEAD TIME.]
VARC Errors
[Plot: August 2003.]
BIT 0 – “ETC FIFO has overflowed”
BIT 4 – “VME buffer has overflowed”
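An error word with these bits can be decoded with a small lookup. A sketch only: just the two bit meanings quoted above are known here, so any other set bit is reported without a meaning. The function name and the idea that errors arrive as an integer word are assumptions.

```python
# Only these two bit meanings are given in the slides.
KNOWN_ERROR_BITS = {
    0: "ETC FIFO has overflowed",
    4: "VME buffer has overflowed",
}


def decode_error_word(word):
    """List a message for every bit set in an integer error word."""
    messages = []
    for bit in range(word.bit_length()):
        if word & (1 << bit):
            messages.append(
                KNOWN_ERROR_BITS.get(bit, f"bit {bit} set (meaning not given)")
            )
    return messages


print(decode_error_word(0b10001))
print(decode_error_word(0b00010))
```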
VARC Errors
[Plot: August 1st 2003 → April 30th 2004. Labels: All 259 ETCs in detector! All 46 VARCs in detector!]
Something went horribly wrong!
VARC Errors
Error bit maps from a sample of data:
[Plot: error bit maps.]
The error bits look really crazy!
VARC Errors
[Plots: August 1st 2003 → January 31st 2005 (two panels); Sep 26th 2003 → Apr 1st 2004.]
VARC Errors
? ? ?
Summary
• Making progress towards a complete run selection.
  – Have analysed data from August 1st 2003 → January 31st 2005.
• Have uncovered some new issues.
  – The database has a large number of gaps.
  – The appearance and disappearance of VARC errors.
• Things that need doing.
  – HV/coil status database + access tools.
  – Hardware database + access tools.