Arnd Meyer (RWTH Aachen) Dec 4th, 2003 Page 1
Tevatron and DØ Status and Plans
Arnd Meyer, RWTH Aachen
DØ Germany Meeting
December 4th, 2003
Data Taking Status
Total data sample on tape with complete detector: > 200 pb⁻¹
...still waiting for the first > 200 pb⁻¹ analysis
Data Taking Status cont.
● Lab reached its goal of delivering 225 pb⁻¹ in FY03
Also for DØ: BD delivered ∫L dt = 227.7 pb⁻¹ in FY03
26 pb⁻¹ per month since May
But had to run 6 weeks longer than hoped
● Cuts into our running time next year
Long shutdown – difficult to anticipate rapid startup
Six week shutdown next summer / fall
● We do not expect to get significantly more ∫L dt in FY04 than we got this year
25% pbar tax (Recycler commissioning), studies ⇒ 233-328 pb⁻¹ in FY04
See Dave McGinnis' presentation on Oct 3 ADM
[Figure axis labels: FY02, FY03]
FY 04 Luminosity Profile
More (2x/week) and shorter (8 hours) accelerator study periods
Studies only if >140 hrs of store time in the previous 14 days
Higher delivered luminosity through improved stacking rate
Improve stacking rate through shorter pbar production cycle time (2.4 sec → 1.7 sec): 11.3 mA/hr (FY03) → 18 mA/hr (FY04)
≃ 3 months turn-on after shutdown (on schedule so far)
pessimistic ∫L dt ≃ 233 pb⁻¹
design ∫L dt ≃ 328 pb⁻¹
Will know by the end of the year if Recycler work was successful (but can benefit only in FY2005)
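The study-period rule and the stacking numbers on this slide can be sketched in a few lines. This is only an illustration: the function names and the per-day store-hour log are hypothetical, and the assumption that the stacking rate scales inversely with the production cycle time is ours, not a statement from the talk. Only the thresholds and rates are taken from the slide.

```python
from typing import Sequence

STUDY_WINDOW_DAYS = 14    # look-back window quoted on the slide
MIN_STORE_HOURS = 140.0   # studies only if store time exceeds this


def studies_allowed(daily_store_hours: Sequence[float]) -> bool:
    """True if the last 14 days contain more than 140 h of store time.

    `daily_store_hours` is a hypothetical per-day log, oldest entry first.
    """
    recent = daily_store_hours[-STUDY_WINDOW_DAYS:]
    return sum(recent) > MIN_STORE_HOURS


def cycle_time_gain(old_cycle_s: float, new_cycle_s: float) -> float:
    """Gain from a shorter pbar cycle alone, assuming rate ~ 1 / cycle time."""
    return old_cycle_s / new_cycle_s


gain = cycle_time_gain(2.4, 1.7)       # cycle-time change quoted on the slide
rate_from_cycle_only = 11.3 * gain     # mA/hr, starting from the FY03 rate
target_gain = 18.0 / 11.3              # gain needed to reach the FY04 goal
```

Under that scaling assumption the shorter cycle alone accounts for most, but not all, of the step from 11.3 to 18 mA/hr.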
... and a Wishlist for End of FY04
[Figure: luminosity projection; annotations: "You are here", 3⋅10³² cm⁻² s⁻¹, 2004]
● STT fully commissioned
● Missing pieces of CTT fully commissioned
● Taking data with rates of 2.5 kHz / 1 kHz / 50 Hz after L1 / L2 / L3 and 90% average efficiency
● CPS / FPS used in the trigger and for physics
● Most data quality problems are caught online
● 1-2 fewer people on shift
● Taking shifts and improving the detector is not considered a necessary evil
● We have 0.5 fb⁻¹ of good data on tape
● Reco takes 1 sec/event on my 2-year old desktop
Data Taking Efficiency
[Figure: data taking efficiency over time; annotations: Shutdown, Winter '03 Shutdown, The Lucky Week, Pre-shutdown special runs/studies]
Efficiency (Post-Shutdown)
● See some of the improvements in the machines already; e.g. lifetime at 150 GeV in the Tevatron now 16-28 hrs vs. a few hours pre-shutdown (removed aperture limitations)
● Biggest problem: “messy” store terminations with large losses, quenches, and CDF losing a couple of Silicon ladders
Not bad after 10 weeks of shutdown!
≃ 8 stores so far
Initial luminosities: 0.7 – 9.0 – 8.6 – 15.9 – 22.1 – 17.9 – 21.6 – 21.9 ⋅ 10³⁰ cm⁻² s⁻¹
Factor of 2 below the best stores before the shutdown
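As a quick cross-check of the "factor of 2" statement, the post-shutdown initial luminosities can be compared with the best pre-shutdown store quoted on the Run II Bests slide (Aug 10, 4.55⋅10³¹ cm⁻² s⁻¹). The variable names are illustrative only.

```python
# Initial luminosities of the first post-shutdown stores, in 1e30 cm^-2 s^-1
post_shutdown = [0.7, 9.0, 8.6, 15.9, 22.1, 17.9, 21.6, 21.9]

# Best pre-shutdown store (Aug 10), converted to the same units
best_pre_shutdown = 45.5

best_post_shutdown = max(post_shutdown)
ratio = best_pre_shutdown / best_post_shutdown

print(f"best post-shutdown store: {best_post_shutdown} e30")
print(f"pre/post ratio: {ratio:.2f}")
```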
Data Taking Efficiency (pre-SD)
● Average data taking efficiency for 2003 is 86%
● Current upper limit is ≃ 95%
3-4% global front end busy
1% begin and end store transitions
<1% run transitions
● Typically in the upper 80%'s for the last six months
Since Beaune, “lost” 83.5 hours of store time (12.6% by time)
12 hours for special runs
Largest single failure (4 hrs): low airflow trips of L1CAL on July 27/28. Fan belt replaced. Symptomatic: some of the largest downtimes are one-time occurrences
Failures by component (without special runs):
– SMT: 12 hours
– Muon/L1Muon: 12 hours (trips, readout errors/crashes, trigger problems)
– CAL: 9 hours (mostly BLS power supplies and hot trigger); + 5 hours L1CAL
– CFT/CTT/PS: 5 hours
– L3/DAQ/Online: 4 hours
Tracking crates readout; collaboration's decision: L1A vs. FEB
Fairly optimized, continuous effort to keep this low
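The numbers on this slide can be tied together with a little arithmetic. The sketch below is an illustration only: treating the "<1%" entry as a 0-1% range is an assumption, and the names are made up for the example. It reproduces the ≃ 95% ceiling and shows how much of the 83.5 lost hours is itemized on the slide.

```python
# Dead-time contributions quoted on the slide, as (low, high) percent ranges.
# The "<1%" entry is taken as 0-1%, which is an assumption.
DEAD_TIME_PERCENT = {
    "global front end busy": (3.0, 4.0),
    "begin/end store transitions": (1.0, 1.0),
    "run transitions": (0.0, 1.0),
}


def efficiency_ceiling(budget):
    """Return the (low, high) range of the achievable efficiency in percent."""
    smallest_loss = sum(low for low, _ in budget.values())
    largest_loss = sum(high for _, high in budget.values())
    return 100.0 - largest_loss, 100.0 - smallest_loss


LOST_HOURS = 83.5       # store time "lost" since Beaune
LOST_FRACTION = 0.126   # the same loss expressed as a fraction of store time
total_store_hours = LOST_HOURS / LOST_FRACTION

# Downtime in hours as itemized on the slide
DOWNTIME_HOURS = {
    "special runs": 12,
    "SMT": 12,
    "Muon/L1Muon": 12,
    "CAL": 9,
    "L1CAL": 5,
    "CFT/CTT/PS": 5,
    "L3/DAQ/Online": 4,
}
itemized_hours = sum(DOWNTIME_HOURS.values())
unitemized_hours = LOST_HOURS - itemized_hours
shares = {name: hours / LOST_HOURS for name, hours in DOWNTIME_HOURS.items()}
```

The itemized entries do not add up to the full 83.5 hours; the slide does not break down the remainder.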
Data Taking Efficiency cont.
● Much of the time running close to our desired efficiency to tape – 90%
● At the same time, data quality improves
Number of conditions that cause us to stop data taking (automagically or manually) is continually increasing
● Credit to many (few) dedicated people!
● Large downtimes are discussed at the weekly operations meeting
● Several systems marginal in terms of expert coverage: one or no resident expert – no manpower for proactive improvements
http://www-d0.fnal.gov/runcoor/
Run II Bests
Regularly updated:
Best days by data taking efficiency
– 95.0% on June 22nd; so far 8 days with 93% or better efficiency
Best runs and days by recorded luminosity (Aug 10, 488 nb⁻¹; May 4, 1.68 pb⁻¹)
Best stores by initial DØ luminosity (Aug 10, 4.55⋅10³¹ cm⁻² s⁻¹)
http://www-d0.fnal.gov/runcoor/
A “Typical” Store
[Figure annotations: Initial Lum. 3.9⋅10³¹ cm⁻² s⁻¹; Wobbly L1 rates (CAL); 5-3% L1 Busy; Store lost (quench)]
Present max. rate guidelines:
Level 1: 1.4 kHz (FEB < 5%)
Level 2: 800 Hz (Muon r/o)
(5-10% headroom to account for rate fluctuations)
Level 3: 50 Hz (Offline; 30% room)
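The rate guidelines imply rejection factors between trigger levels, and they can be compared with the FY04 wishlist rates (2.5 kHz / 1 kHz / 50 Hz). A minimal sketch follows; the function and variable names are illustrative and not part of any DØ software.

```python
# Present maximum rate guidelines from the slide, in Hz
GUIDELINES_HZ = {"L1": 1400.0, "L2": 800.0, "L3": 50.0}

# Wishlist rates for the end of FY04, in Hz
WISHLIST_HZ = {"L1": 2500.0, "L2": 1000.0, "L3": 50.0}


def rejection_factors(rates):
    """Rejection needed at L2 and L3 to turn the input rate into the output rate."""
    return {
        "L2": rates["L1"] / rates["L2"],
        "L3": rates["L2"] / rates["L3"],
    }


def over_guideline(rates, guidelines=GUIDELINES_HZ):
    """List the trigger levels whose rate exceeds the present guideline."""
    return [level for level, limit in guidelines.items() if rates[level] > limit]


print(rejection_factors(GUIDELINES_HZ))
print(over_guideline(WISHLIST_HZ))
```

With the present guidelines Level 3 has to reject far more than Level 2; the wishlist rates exceed the current Level 1 and Level 2 limits while the rate to tape stays the same.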
“Typical” Store (Post-Shutdown)
7% L1 Busy at 1 kHz, 15% at 1.5 kHz
File transfers to FCC
Control Room Shifts
● It is a burden on the collaboration to fill that many shifts (and schedule them!)
The shifter duty is 7 shifts / 6 months per person on masthead
Only about 1 shift per month per person (on average!)
● Calorimeter and Muon shifts consolidated into CalMuon shifts since June
Rocky at times, but overall OK
Cost of additional training offset by savings in total number of shifts
There is more training involved – took some time to be realized by “old” shifters
● Next natural choice for merging is SMT / CFT – will require initiative from detector groups (clear instructions, simplify, automate)
● More than a third of the collaboration have not yet taken a single shift in 2003
The fact that we are collecting data with high efficiency is to a large part due to the presence of 5 – 6 well trained people in the control room
Data Quality & Global Monitoring
● Online data quality monitoring consists of three (four) parts
Significant Event System (“Slow Control”)
– Catches an increasing number of hard- and software failures
– In many cases pauses the run to ensure consistent data quality
– Working very well, could use additional expert guidance
DAQ Artificial Intelligence
– Notifies shifter of abnormal conditions (global rate fluctuations, BOT trigger rate, ...)
– Automagically fixes certain problems (SCLinit), e.g. sync. problems in L2
Sub-detector monitoring examines
– Many expert-level plots (but experts are not generally on shift)
Global Monitoring
– Trigger rates, Trigger Examine, Vertex Examine, Physics Examine
● Global Monitoring has great potential, but there are many issues – examples follow
Global Monitoring cont.
● Technical Issues
During the transition from trigger list v11 to v12, ran for weeks with wrong/bad reference plots (different triggers, then rapidly changing prescales)
PhysEx uses random sample – should be based on certain triggers
Low statistics (slow reconstruction)
● Psychological Issues
Lack of interaction between detector shifters and GM
– GM detects feature in Gtrack phi distribution – SMT shifter cannot correlate with his occupancy plots
– Need effort from all detector groups to bring their expertise into GM plots (only Muon group has done this so far)
● Organizational Issues
Shifts not being filled (e.g. 8 in August)
– If we can't fill all shifts, need to think about merging with Captain's and other shifters' duties
DQ & GM cont.
● LmTrigger urgently needs improvement
For example averaging over different time periods, uncertainties, luminosity dependence of trigger cross sections, ...
Extremely important tool to identify problems quickly
● Overall, somewhat slow progress (remember Beaune?)
● If we want to continue reducing the number of shifters in the control room, GM needs a major effort (time, people, attention) from the collaboration – great task for groups new to DØ
Automation should be the goal
Need to catch all major problems online
● Up to one third of the data is thrown away in the analysis stage – sad!
Everybody who discards data through “bad” or “good” run lists should make that extra step and think about how to catch the problems earlier!
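To put the discarded data in perspective, the 86% average data taking efficiency can be combined with the "up to one third" lost at the analysis stage. This is a rough worst-case illustration under the assumption that the two losses simply multiply; the function name is hypothetical.

```python
DATA_TAKING_EFFICIENCY = 0.86      # 2003 average quoted in the talk
MAX_DISCARDED_FRACTION = 1.0 / 3   # "up to one third" dropped in analysis


def usable_fraction(recording_efficiency: float, discarded_fraction: float) -> float:
    """Fraction of delivered luminosity that survives recording and run selection."""
    return recording_efficiency * (1.0 - discarded_fraction)


worst_case = usable_fraction(DATA_TAKING_EFFICIENCY, MAX_DISCARDED_FRACTION)
print(f"worst-case usable fraction: {worst_case:.1%}")
```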
“Summer” Shutdown
● Successful 10 week shutdown (Sep 8 – Nov 17)
7-8 weeks for experiments, 2-3 additional weeks with limited access
● Four scheduled power outages, a couple of unscheduled ones
● 24x7 DAQ shifts and day shift Captains – thankless task! (“Good God, please, someone, if you see me in the parking lot, run me over. Kill me. I am so bored to death.” - anonymous DAQ shifter)
● Major DØ goals for the shutdown
Improve reliability: reduce access time, periods with incomplete detector etc.
Improve quality of the data: reduce calorimeter noise, repair Silicon HDI's
● Some major accelerator tasks
Recycler vacuum improvements, bakeout
Tevatron alignment work, installation of Tev alignment network
Replace rotting magnet stands
Improve some aperture limitations (Tev, transfer lines), upgrade instrumentation
Post Shutdown Status
● Access went smoothly overall – no accidents, on schedule, great support
Detector groups together with mechanical and electrical support groups have developed detailed job lists including detector opening and closing, allocation of manpower resources from detector groups and support teams, survey as needed during major moves
● First store on November 22nd, as scheduled
Took data within 3 minutes of first 36x36 store
Quality data taking established with 2nd 36x36 store
● Single days with about 90% data taking efficiency, already many runs with >94% efficiency
Biggest downtimes:
– Solenoid protection electronics failure (~4 hours)
– MCH ↔ FCH switch failure (~2 hours)
● Comprehensive reviews of online and offline quality of the data taken after the shutdown
December 12th, 15th, 19th
Identify more problems much earlier than at the “Bad Run List” level
Silicon Status
[Figure annotation: Current Dose]
● Cancellation of Silicon replacement means we must plan to operate current detector for the life of the experiment
Layer 0 (which likely will be a part of the rebaselined Run II upgrade) without additional hits from the outer SMT layers is of little use
Have to evaluate what steps can be taken to increase chances of the detector's reliable operation long term
● Bias scans (August): measure depletion one layer at a time
Use tracks from CFT and other SMT layers to determine cluster charge as a function of HV
Runs with HV varied between 0% and 100% in 10% steps
SMT in full readout (no sparsification)
Average over ladders (statistics)
Results confirm expectation, but with large uncertainties
Silicon Status
● Main task during shutdown: repairs of failed HDI's / electronics
Before shutdown: 136 disabled (up from “irreducible” 84 just after Jan shutdown)
12 are “definitely repaired” (2 weeks of 2 shifts/day with 4 people)
59 are “unstable”
● 112 HDI's are currently not powered
50 Ladders (11.6%)
30 F-wedges (10.4%)
32 H-wedges (16.7%)
● A few HDI's failed when magnet was energized
● All but 2-3 of the enabled HDI's participate in track reconstruction
● SMT is operating stably
[Figure: 1st store after shutdown]
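The quoted counts and percentages of unpowered HDI's are mutually consistent, as the short check below shows. The totals per device type (432 ladders, 288 F-wedge HDI's, 192 H-wedge HDI's) are not stated on the slide; they are back-calculated from the percentages and should be read as an assumption.

```python
# Unpowered HDI's by device type, as quoted on the slide
UNPOWERED = {"ladders": 50, "F-wedges": 30, "H-wedges": 32}

# Totals per type: an assumption inferred from the quoted percentages
ASSUMED_TOTALS = {"ladders": 432, "F-wedges": 288, "H-wedges": 192}

percent_off = {
    kind: round(100.0 * count / ASSUMED_TOTALS[kind], 1)
    for kind, count in UNPOWERED.items()
}
total_off = sum(UNPOWERED.values())
overall_percent_off = 100.0 * total_off / sum(ASSUMED_TOTALS.values())

print(percent_off)
print(f"{total_off} HDI's off, {overall_percent_off:.1f}% overall")
```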
Central Fiber Tracker
● Shutdown tasks
Maintenance of LVPS's and installing upgraded LVPS (better connectors), maintenance of the VLPC He cooling system
Major job: modification of AFE boards to remove unused SVX inputs from the readout
– Reduce data size and DAQ deadtime
● Known issues: channels corresponding to 10 SVX chips are not operational
Swapped AFE boards – problem stays
Found that problem appeared after last power outage with warm up of cryostat – serious problem!
● Reconstruction is still using old (wrong) CFT maps – makes offline data quality checks hard
Calorimeter
● Replaced all large cooling fans for preamplifier cooling during shutdown
● Studies of calorimeter noise – access priority given to “noise task force”
Characterize noises: 10MHz, 14.3MHz (RF/4), “Ring of Fire”, ...
Controlled power-up after power outages to identify sources
Improve grounding
● “Ring of Fire” / “Welder Noise”
Sudden burst of triggers/events
Appears when a welder is triggered in DAB3
Other unidentified sources
● Entry in the cryostat identified
Noise disappears almost entirely when temperature monitoring cables are disconnected
Calorimeter Noise
● RF/4 noise appeared when muon chambers switched back on
Unstable
Not really observed in data
● 10 MHz noise from SMT sequencers
Went around with a radio tuned to “WSMT” (10 MHz) to identify sources
● Series of grounding tests (shutdown)
Disconnect AC, safety ground, phone, etc.
Visual inspection: found ≃ 10 contacts
Attach current controlled power supply, slowly increase current up to 50 A, and look for heat sources
Found (and fixed) a few more problems
Improved grounding reduces “welder noise” with temperature cables attached by about a factor of 2
● Does all the work pay off?
Robert: “Looks better than ever before!”
(New) Calorimeter Monitoring
Occupancy/energy views to catch hot/cold zones
[Figure: 2nd store after shutdown]
Muon Systems
Shutdown Tasks:
● Forward Muon
First-time access to A layer forward muon tracker (8-12 hours for opening and closing) completed: replacement of preamplifiers, gas leaks, gas monitors; C layer repairs in progress
Number of non-working channels now 0.15% (trigger counters), 0.5% (drift tubes)
● Central Muon
Installed extra trigger counters under the detector – running into a few snags (tight clearance on east side) but no show-stoppers
Installed 144 remote power cycle relays for front-end electronics
Pulled a couple of wires drawing moderate to high currents for investigation
Installed Power PC's in the remaining muon readout crates that had 68k's
One PDT problem will require more than 4 hours access to repair
● Muon systems are collecting physics quality data – no known serious issues
The infamous bottom hole
Luminosity System & FPD
● Luminosity system
Cable work in the gaps
Complete readout electronics installed (finally!). Required to reduce the embarrassing 10% uncertainty on our luminosity measurement
● Forward proton detector
Installation of electronics for full system operation, all 18 pots in 6 castles
Level 2 Upgrade
● All Level 2 Alphas have been replaced with Betas! Running smoothly so far
● Too fast for PDT readout code – had to slow down temporarily until firmware corrected
● Indications that the reason for increased front-end busy lies within the Level 2 system
Silicon Track Trigger
● Reminder
The STT is part of the Level 2 Trigger System
Based on L1CTT roads, refit tracks using SMT axial hits ⇒ improved pT, φ, impact parameter
Reduce background, improve pT resolution, cut on impact parameter
All Run II papers CDF has published so far are based on their equivalent (SVT)
● By default 5 STT crates in the readout after the shutdown (none before)
● Not yet in trigger
And unfortunately there's actually little point presently, with 1.4kHz L1A and 1kHz L2A
Trigger, DAQ, Online, General
● Firmware upgrades on L1CTT and L1Muon, maintenance on L1CAL
● Online & Controls
Replaced 8 disks that have died over the last 2 years (4 disks for the data logger)
Major online software upgrades – Python, Epics, VxWorks
● New Trigger Control Computer, Online IP shuffling, other upgrades/maintenance
● General detector maintenance – Air handlers, hydraulic systems, vacuum jackets, cooling water systems, ODH heads, etc.
● Old Cryo UPS replaced
● ....
No time to talk about many other shutdown jobs!
Conclusions
● Detector is running well (better than some people want to make us believe)
Data taking efficiency 86% for the year – “physics analysis efficiency”??
216 pb⁻¹ integrated luminosity in hand with full detector in readout
Progress in online data quality monitoring not as good as hoped for
● Shutdown went well and on schedule – a lot of work completed
● Came out of the shutdown well prepared
First store on November 22nd, as scheduled
Took data within 3 minutes of first 36x36 store, quality data taking established with 2nd 36x36 store
● Major worries
L1 Bandwidth, data quality monitoring, diminishing manpower
Disconnect between data taking and analysis
Offline progress (processing/reprocessing) is slowing down physics output