D-Zero Hall Log Book for 2011 (and some of 2012)
---------------------------------------------------

The most recent entries are near the beginning of this file.
This file begins in January 2011.

This file contains both Trigger Framework and L1 Calorimeter Trigger
entries.

Earlier D-Zero Hall Log Books are on the web in one of the following
directories:

    http://www.pa.msu.edu/hep/d0/ftp/l1/framework/logs/
    http://www.pa.msu.edu/hep/d0/ftp/run1/l1/inventory_logs/

------------------------------------------------------------------------------

DATE:  5-Mar-2013                              At:  MSU

TOPICS:  Location of the ADF-2 cards

As of Monday 4-Mar-2013 the ADF-2 cards are in the following places:

  - Today Geoff gave the 4 cards from the DAB1 Yellow Storage Rack to
    Michelle and Bill Badgett.  They now have:

        D36, D37, D33 Maestro, C5

    They also have the second of the 2 Wiener crates that belong to
    T962 ArgoNeut.

  - At MSU we currently have the following 6 cards:

        C1, C2, C3, C4, D34, D35

  - Installed in the running cold DAQ480 (160) system at LAPD there are
    the following 10 ADF-2 cards:

        A21, A20, B18, B19, B20, B21, A17, A18, A19, B17

    These 10 cards are in slots 5 through 14, with card A21 in slot 5
    as the Maestro.  All 10 of these cards have their input termination
    resistors removed and are set up for a 2.0V full scale input range.

  - The remaining 80 ADF-2 cards are still in the L1 Cal Trig system
    in MCH-1.

------------------------------------------------------------------------------

DATE:  14-June-2012                            At:  MSU

TOPICS:  Turn Off of the TFW and TCC1

The TFW and its associated equipment were turned off on 4-June-2012.

After the end of the Silicon Annealing study in mid May they kept the
DAQ running ZB until about 24-May-12.  I do not think that there was
any event flow from May 25th until it was turned off on June 4th.
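
A quick cross-check of the rate numbers in the Trigmon snapshot reproduced
below.  This is only a sketch: the values are copied by hand from the
24-May-2012 09:48:04 display, the variable and function names are made up
for this note and are not part of Trigmon or TRICS, and comparing the sum
of the Specific Trigger rates with the overall rate assumes that the
triggers rarely fire on the same crossing.

```python
# Values copied by hand from the 24-May-2012 09:48:04 Trigmon snapshot.
L1_ACCEPT_HZ = 1094.30   # overall L1 Accept rate
L2_ACCEPT_HZ = 306.90    # overall L2 Accept rate

# (L1 Accept Hz, L2 Accept Hz) for Specific Triggers 0 through 3.
SPEC_TRIG_HZ = [
    (9.05, 9.05),
    (787.53, 0.00),
    (297.81, 297.81),
    (0.05, 0.05),
]


def l2_accept_percent(l1_hz, l2_hz):
    """Percentage of L1 Accepts that also received an L2 Accept."""
    return 100.0 * l2_hz / l1_hz


accept_pct = l2_accept_percent(L1_ACCEPT_HZ, L2_ACCEPT_HZ)
reject_pct = 100.0 - accept_pct

# Sums over the Specific Triggers.  These are only expected to be close
# to the overall rates if overlaps between triggers are rare (assumption).
sum_l1_hz = sum(l1 for l1, _ in SPEC_TRIG_HZ)
sum_l2_hz = sum(l2 for _, l2 in SPEC_TRIG_HZ)

print(f"L2 Accept/Reject = {accept_pct:.1f} % / {reject_pct:.1f} %")
print(f"Sum of Spec Trig L1 rates = {sum_l1_hz:.2f} Hz (display {L1_ACCEPT_HZ:.2f})")
print(f"Sum of Spec Trig L2 rates = {sum_l2_hz:.2f} Hz (display {L2_ACCEPT_HZ:.2f})")
```

The first printed line agrees with the 28.0 % / 72.0 % shown on the display,
and the two sums land within a fraction of a Hz of the displayed overall
rates.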

              ----------------------------------------------
              DZero Level 1 Trigger Trigmon Display Snapshot
              ----------------------------------------------

Level 1 Trigger Framework     Specific Triggers      24-May-2012 09:48:04
Integrat Period L1/L2FW = 61.5 s / 61.5 s   L1 Accept = 1094.30 Hz/2048654710
Operational:Yes Current:Yes Triggered:Yes   L1 FW Paused= 0.0 % /NowRunning
L2 Bypassed:No Outstanding L1 Accept: 0     L2 Accept = 306.90 Hz/ 573594974
Last FPGA Configure = 04-Apr-2012 15:28     L2 Accept/Reject= 28.0 % / 72.0 %
Last FW Initialize = 01-May-2012 10:35      L2 FW Stalled by L2 Busy = 0 %
Last SCL Initialize = 5 h 31 mn 19 s        Luminosity Block Num = 0x 0083 feba
Last LBN Increment = 34 s                   Tick / Turn = 146/ 206308604
Allocated SpTrg: 4 ExpGrp: 1 GeoSect: 6

Spec|  L1  |     L2 Accept      |And-Or|Prescl|Total|ExpGp|ExpGp|  L3 | COOR|Ex
Trig|Accept|                    | Fired| Ratio|Expos| Live|L1 Bz|Disab|Disab|Gp
---#|----Hz|----Hz|----%|-------|----Hz|------|----%|----%|----%|----%|----%|-#
   0|  9.05|  9.05|100  |16.610M|1.712M|191000|  0.0|  0  |  1.3|  0  |  0  | 0
   1|787.53|  0.00|  0  |      0|1.712M|  2150|  0.0|  0  |  1.3|  0  |  0  | 0
   2|297.81|297.81|100  |556.45M|1.712M|  5700|  0.0|  0  |  1.3|  0  |  0  | 0
   3|  0.05|  0.05|100  | 184592|1.712M|17.20M|  0.0|  0  |  1.3|  0  |  0  | 0

Level 1 Trigger Framework     Geographic Sections    24-May-2012 09:48:04
Integrat Period L1/L2FW = 61.5 s / 61.5 s   L1 Accept = 1094.30 Hz/2048654710
Operational:Yes Current:Yes Triggered:Yes   L1 FW Paused= 0.0 % /NowRunning
L2 Bypassed:No Outstanding L1 Accept: 0     L2 Accept = 306.90 Hz/ 573594974
Last FPGA Configure = 04-Apr-2012 15:28     L2 Accept/Reject= 28.0 % / 72.0 %
Last FW Initialize = 01-May-2012 10:35      L2 FW Stalled by L2 Busy = 0 %
Last SCL Initialize = 5 h 31 mn 19 s        Luminosity Block Num = 0x 0083 feba
Last LBN Increment = 34 s                   Tick / Turn = 146/ 206308604
Allocated SpTrg: 4 ExpGrp: 1 GeoSect: 6

 Geogr|  L1 |    L2 Busy     |   L1 GS Accept    |   L2 GS Accept    |L1|L2|SCL
  Sect| Busy| Raw |Delay|Cycl|             |L1Acc|             |L2Acc|Er|Er|Sta
--#|0x|----%|----%|----%|---%|----Hz|------|----%|----Hz|------|----%|--|--|---
 10|0a|  0  |  0  |  0  |0.00|1094.3|2.048G|100  |306.93|573.2M|100.0|  |  | Ok
 11|0b|  0  |  0  |  0  |0.00|1094.3|2.048G|100  |306.93|573.2M|100.0|  |  | Ok
 31|1f|  1.3|  0  |  0  |0.00|1094.3|2.048G|100  |306.93|573.5M|100.0|  |  | Ok
 32|20|  0  |  0  |  0  |0.00|1094.3|2.048G|100  |306.93|573.5M|100.0|  |  | Ok
118|76|  0  |  0  |  0  |0.00|1094.3|2.048G|100  |306.93|573.2M|100.0|  |  |Bad
127|7f|  0  |  0  |  0  |0.00|1094.3|2.048G|100  |306.94|573.5M|100.0|  |  | Ok

Level 1 Trigger Framework     And-Or Terms           24-May-2012 09:48:04
Integrat Period L1/L2FW = 61.5 s / 61.5 s   L1 Accept = 1094.30 Hz/2048654710
Operational:Yes Current:Yes Triggered:Yes   L1 FW Paused= 0.0 % /NowRunning
L2 Bypassed:No Outstanding L1 Accept: 0     L2 Accept = 306.90 Hz/ 573594974
Last FPGA Configure = 04-Apr-2012 15:28     L2 Accept/Reject= 28.0 % / 72.0 %
Last FW Initialize = 01-May-2012 10:35      L2 FW Stalled by L2 Busy = 0 %
Last SCL Initialize = 5 h 31 mn 19 s        Luminosity Block Num = 0x 0083 feba
Last LBN Increment = 34 s                   Tick / Turn = 146/ 206308604
Allocated SpTrg: 4 ExpGrp: 1 GeoSect: 6

 A-O|And-Or|FI|Snc  A-O|And-Or|FI|Snc  A-O|And-Or|FI|Snc  A-O|And-Or|FI|Snc
Term| Fired|FO|Err Term| Fired|FO|Err Term| Fired|FO|Err Term| Fired|FO|Err
---#|----Hz|--|--- ---#|----Hz|--|--- ---#|----Hz|--|--- ---#|----Hz|--|---
   0|  0.00|**|      64|  0.00|**|     128|  0.00|**|     192|  0.00|**|
   1|  0.00|**|      65|  0.00|**|     129|  0.00|**|     193|  0.00|**|
   2|  0.00|**|      66|  0.00|**|     130|  0.00|**|     194|  0.00|**|
   3|  0.00|**|      67|  0.00|**|     131|  0.00|**|     195|  0.00|**|
   4|  0.00|**|      68|  0.00|**|     132|  0.00|**|     196|  0.00|**|
   5|  0.00|**|      69|  0.00|**|     133|  0.00|**|     197|  0.00|**|
   6|  0.00|**|      70|  0.00|**|     134|  0.00|**|     198|  0.00|**|
   7|  0.00|**|      71|  0.00|**|     135|  0.00|**|     199|  0.00|**|
   8|  0.00|**|      72|  0.00|**|     136|  0.00|**|     200|  0.00|**|
   9|  0.00|**|      73|  0.00|**|     137|  0.00|**|     201|  0.00|**|
  10|  0.00|**|      74|  0.00|**|     138|  0.00|**|     202|  0.00|**|
  11|  0.00|**|      75|  0.00|**|     139|  0.00|**|     203|  0.00|**|
  12|  0.00|**|      76|  0.00|**|     140|  0.00|**|     204|  0.00|**|
  13|  0.00|**|      77|  0.00|**|     141|  0.00|**|     205|  0.00|**|
  14|  0.00|**|      78|  0.00|**|     142|  0.00|**|     206|  0.00|**|
  15|  0.00|**|      79|  0.00|**|     143|  0.00|**|     207|  0.00|**|
  16|  0.00|**|      80|  0.00|**|     144|  0.00|**|     208|  0.00|**|
  17|  0.00|**|      81|  0.00|**|     145|  0.00|**|     209|  0.00|**|
  18|  0.00|**|      82|  0.00|**|     146|  0.00|**|     210|  0.00|**|
  19|  0.00|**|      83|  0.00|**|     147|  0.00|**|     211|  0.00|**|
  20|  0.00|**|      84|  0.00|**|     148|  0.00|**|     212|  0.00|**|
  21|  0.00|**|      85|  0.00|**|     149|  0.00|**|     213|  0.00|**|
  22|  0.00|**|      86|  0.00|**|     150|  0.00|**|     214|  0.00|**|
  23|  0.00|**|      87|  0.00|**|     151|  0.00|**|     215|  0.00|**|
  24|  0.00|**|      88|  0.00|**|     152|  0.00|**|     216|  0.00|**|
  25|  0.00|**|      89|  0.00|**|     153|  0.00|**|     217|  0.00|**|
  26|  0.00|**|      90|  0.00|**|     154|  0.00|**|     218|  0.00|**|
  27|  0.00|**|      91|  0.00|**|     155|  0.00|**|     219|  0.00|**|
  28|  0.00|**|      92|  0.00|**|     156|  0.00|**|     220|  0.00|**|
  29|  0.00|**|      93|  0.00|**|     157|  0.00|**|     221|  0.00|**|
  30|  0.00|**|      94|  0.00|**|     158|  0.00|**|     222|  0.00|**|
  31|  0.00|**|      95|  0.00|**|     159|  0.00|**|     223|  0.00|**|
  32|  0.00|**|      96|  0.00|**|     160|  0.00|**|     224|  1.46|23|
  33|  0.00|**|      97|  0.00|**|     161|  0.00|**|     225|  2.93|23|
  34|  0.00|**|      98|  0.00|**|     162|  0.00|**|     226| 11.70|23|
  35|  0.00|**|      99|  0.00|**|     163|  0.00|**|     227| 11.67|23|
  36|  0.00|**|     100|  0.00|**|     164|  0.00|**|     228| 11.67|23|
  37|  0.00|**|     101|  0.00|**|     165|  0.00|**|     229| 23.34|23|
  38|  0.00|**|     102|  0.00|**|     166|  0.00|**|     230| 46.69|23|
  39|  0.00|**|     103|  0.00|**|     167|  0.00|**|     231| 93.38|23|
  40|  0.00|**|     104|  0.00|**|     168|  0.00|**|     232|  1.43|23|
  41|  0.00|**|     105|  0.00|**|     169|  0.00|**|     233|  2.86|23|
  42|  0.00|**|     106|  0.00|**|     170|  0.00|**|     234| 11.44|23|
  43|  0.00|**|     107|  0.00|**|     171|  0.00|**|     235|  0.00|23|
  44|  0.00|**|     108|  0.00|**|     172|  0.00|**|     236| 11.46|23|
  45|  0.00|**|     109|  0.00|**|     173|  0.00|**|     237| 22.85|23|
  46|  0.00|**|     110|  0.00|**|     174|  0.00|**|     238| 45.78|23|
  47|  0.00|**|     111|  0.00|**|     175|  0.00|**|     239| 91.55|23|
  48|  0.00|**|     112|  0.00|**|     176|  0.00|**|     240|22981.|**|
  49|  0.00|**|     113|  0.00|**|     177|  0.00|**|     241|190.8k|**|
  50|  0.00|**|     114|  0.00|**|     178|  0.00|**|     242|47712.|**|
  51|  0.00|**|     115|  0.00|**|     179|  0.00|**|    #243|1.717M|**|
  52|  0.00|**|     116|  0.00|**|     180|  0.00|**|     244|811.1k|**|
  53|  0.00|**|     117|  0.00|**|     181|  0.00|**|     245|1.622M|**|
  54|  0.00|**|     118|  0.00|**|     182|  0.00|**|     246|  0.00|**|
  55|  0.00|**|     119|  0.00|**|     183|  0.00|**|    #247|21885.|**|
  56|  0.00|**|     120|  0.00|**|     184|  0.00|**|     248|2188.6|**|
  57|  0.00|**|     121|  0.00|**|     185|  0.00|**|     249|143.1k|**|
  58|  0.00|**|     122|  0.00|**|     186|  0.00|**|     250|143.1k|**|
  59|  0.00|**|     123|  0.00|**|     187|  0.00|**|     251|47712.|**|
  60|  0.00|**|     124|  0.00|**|     188|  0.00|**|     252|47712.|**|
  61|  0.00|**|     125|  0.00|**|     189|  0.00|**|     253|47712.|**|
  62|  0.00|**|     126|  0.00|**|     190|  0.00|**|     254|47712.|**|
  63|  0.00|**|     127|  0.00|**|     191|  0.00|**|    #255|7.586M|**|

The TFW Control Computer d0tcc1 was turned off on 13-June-2012.

    Date: Wed, 13 Jun 2012 15:26:27 -0400
    To: edmunds@pa.msu.edu
    Subject: Trigmon Stopped

    Trigmon Stopped : 15:08:51 EST

------------------------------------------------------------------------------

DATE:  16:18-May-2012                          At:  Fermi

TOPICS:  Collaboration and Operation meetings, PAB meeting,
         Wiener Crate, Walk through

With the end of the SMT annealing tests they plan to turn off the whole
DAQ system by the end of May.

The normal Control Room Log Book can no longer be seen from off-line
machines because of a FSL 4/5 issue.

From the PAB meeting the tentative date to start work on LongBo
installation is June 13th.  They would also like 16 NIM inputs to the
LongBo data logging.  I need to verify how they want to do the delay
between trigger generation and the stop of circular buffer writing.

I brought back the T962 VME crate that had been moved from the SciBooNE
pit to PAB for use in LongBo.  This is Wiener crate UEV-6023 SN 4993008
power supply UEP-6021 SN 4993063.

MCH-1 looked OK but the power to the L1 Cal Trig racks was back on for
some reason.
I *assume* that someone turned it back on after the power outage a few
weeks ago ?  I left it on because maybe someone needs it on: for some
monitoring or alarm state stuff to get things running, or maybe they need
the water flow valves to be open, and because the whole DAQ should be off
in another 2 weeks.

Visible fans were OK and about 89 deg in top back M124.

------------------------------------------------------------------------------

DATE:  16-MAY-12                               At:  MSU

TOPICS:  Update Spin to skip making archival copy

The Alphastation that was used by Spin to make an additional and
versioned copy of each file uploaded via Spin has been turned off.
We are no longer actively creating content for the DZero website and
the automatic backup of the department server is deemed sufficient.

Spin has been updated (from version 7 to version 8) to now only connect
to the department ftp server.  The updated script can be found at
www.pa.msu.edu/hep/d0/ftp/spin/spin.py

------------------------------------------------------------------------------

DATE:  8,9-FEB-2012                            At:  Fermi

TOPICS:  Collaboration and CalOp meetings, PAB meeting and LAPD visit,
         5k Off, Bit-3 PCI end, Walk through

With a phone call from Dean I turned off the 5k test stand.
Turned off: BLS breakers, ADC breakers, ADC bench supply, and Preamp
breakers.  As requested I unplugged the Preamp 3-phase plugs from the
wall.  I found 2 of these 3-phase cords plugged in and 1 on the floor.
The only thing that I see running is the PC - everything else now
looks off.

Bill had removed the Bit-3 PCI card from the spare TCC-3 aka spare
L1Cal TCC.  I returned it to MSU.

Push for Long-Bo TPC in LAPD and/or 35t - will have the same flange,
problem with Cathode HV in LAr - bubbles in LAr, flash lamp noise,
cosmic muon trig, LAR1 vessel order, trapped LAr ions in the
circulation current.

TFW equipment is closed up and running OK.  ZB and annealing runs
continue.  M123 upper back temperature in the range 88.4 - 89.0.
Visible fans running.

------------------------------------------------------------------------------

DATE:  11:13-JAN-2012                          At:  Fermi

TOPICS:  Work with the folks doing the SMT Annealing study to set up
         a trigger, More clean up of L1 Cal Trig stuff.

Returned the pager to Bill Lee.

Talked with Mike Matulik about the broken L1 Cal Trig Wiener power
supply from Friday 7-October-2011.  Right after the failure it was taken
to Prep to be returned to Wiener for repair.  It is still at Prep.  All
of the folks at Prep who knew what they were doing retired at the end of
the run.

I installed in the back of rack M102 (the air inlet rack) the 9U crate
with the spare Columbia backplane and the spare TAB, GAB cards and their
SCL/VME card.  Also installed in the back of M102 the 9U card file with
some spare VRB, G-Link Transition, and VRBC cards along with a spare SBC
card and a 9U optical Bit-3 VME card.  There is no change to the outside
appearance of the racks.  George said that I could stuff these old spare
parts there.

So all of the spare equipment from the sidewalk is now stored in the
racks in MCH-1 as follows:

  Inside the back of rack M102 there is:

      A 9U Crate with a spare TAB/GAB Backplane containing:
          2 TAB cards, 2 GAB cards, 1 VMESEL card

      A 9U Crate without any backplane containing:
          1 SBC, 3 VRB, 1 VRBC, 3 VTM

      Labeled Boxes containing:
          Spare TAB/GAB Backplane
          Spare TAB/GAB Wiener Fan Tray
          Spare Gore LVDS Cables
          Spare Optical Cables for L1 Cal

  Inside the back of rack M112 there is:

      Labeled Boxes containing:
          Spare TAB/GAB Wiener Power Supply
          Spare Gore LVDS Cables

The "L1 Cal Trig" cabinet from the sidewalk is now empty and has a label
on it saying that it is available for reuse.  This cabinet is currently
up on the 6th floor in about the middle of the equipment room, Rm 603.
The combination of the lock on this cabinet is 2005.

The scope that had been in the MCH-1 L1 Cal Trig racks for checking bad
Trigger Tower signals belongs to Dean.  I checked with Mike Matulik and
Dean to figure out who owned it.
I talked to Dean and he asked me to put this scope up with his stuff on
the Kotcher Platform so I have moved it there.

So far the SMT Annealing study has used the CTT trigger to get cosmic
tracks directed through SMT layer 0.  It looks like the self triggering
idea for SMT is not going to work.  The S/N ratio is way off what is
needed for a charge collection study as part of the overall SMT
Annealing study.  The current plan is to ramp down the liquid helium
plant by early next week so the CFT signals will go away and thus the
CTT triggering will end.  That may/will end the need to run the DAQ.

Racks were/are closed up and running at 89 deg in M124 top back.

1075 1977 1993 3161

------------------------------------------------------------------------------

DATE:  7:9-DEC-2011                            At:  Fermi

TOPICS:  Clean up and storage, Trigger operation for the SMT annealing
         study, Collaboration Meeting, PAB work

All of the L1 Cal spare stuff has been collected except for the spare
Wiener supply for the TAB/GAB crate and the spare Gore cables.  Selcuk
said that he will track these down.  The intent remains to stuff
everything into the MCH-1 racks but George has not approved this yet.
I have sent all the AMP cables and a pile of other stuff to recycling.

Talking with folks about the setup for the SMT annealing run.  It is
still not clear if this is going to happen.  SMT warmup starts Dec 23rd.
The VLPC warm up and thus the end of CTT triggers starts January 11th.
From what I can figure out so far that is probably way too short for the
annealing study.  Talked with Mike and still need to talk with Andy.
I missed the meeting about this.  Liquid starts coming out of the
Calorimeter on January 5th.

I ran the Long-16 card at PAB and that looks fine.  The current schedule
for Long-Bo is "February".  They have 5 msec purity in LAPD.

The racks were/are closed up.  M124 top back temperature was in the
range 88.6 - 88.7 degrees.  Visible fans are running.

------------------------------------------------------------------------------

DATE:  20:21-Oct-2011                          At:  Fermi

TOPICS:  ADF Crate Wiener Power Supply, TAB Errors ATC Card Replaced,
         Move Out of Pit and Sidewalk, PAB Work and Meeting, Walk-through

On Friday October 7th at about 10 AM the power supply in ADF Crate "A"
failed.  Joe Haley, Selcuk, and Mike Matulik replaced it with the spare.
Since then Mike has shipped the failed supply back to Wiener (via Prep)
for repair.  This trip I put the supply from one of the T962 Wiener
crates in our D-Zero Hall spares cabinet and labeled it as the spare for
the L1 Cal Trig ADF and Control Crates.  This T962 supply can be our
spare until our actual supply is repaired.

Early Friday morning the TAB connected to the bottom LVDS output from
the ADF crate "D" slot 21 started showing parity errors on every sweep
of the monitoring.  Selcuk checked and this is TAB #0 Chip #0
Status = 0x8001.  With a cable swap this moved to TAB #7 Chip #0
Status = 0x8002.  This is the ATC's bottom, i.e. "C" output.

I tried the normal things: pushed the cable, Initialized L1 Cal, swapped
the "B" and "C" cables from ADF crate "D" slot 21.  None of this fixed
the errors.  Swapped the passive ATC card and that stopped the errors.
Pulled ATC card SN# 075.  Installed ATC card SN# "Spare 3".  I had
previously fixed/tested ATC Spare 3.  We are down to one spare ATC and
it is only kind of OK.

We had to move the L1 Cal Trig stuff 100% off of the sidewalk.  My plan
is to install the two crates full of spare cards in the racks in MCH-1,
i.e. in the MCH-1 racks with just cable patch panels, e.g. M105, M110,
M112.  I ran out of time before getting this done and the spare card
crates are on our bench/desk.  Sidewalk and pit are now completely
cleaned out.

Dean's 2V ADFs to PAB.  Tested there and ran fine.  Ran with 198 nsec
and 494 nsec firmware.

PAB Long-Bo installation meeting.  MSU needs to deliver only 9 Internal
cables.  The length is to be determined.
This is 2 meter drift with a 250 KVolt cathode supply.

Also delivered a T962 Wiener crate (minus its power supply) to PAB for
installation at LAPD.  It is labeled and Stephen knows that it is there.
It has a long fiber optics cable, a T962 VME Bit-3, and the
SCLD-Substitute card in it.

The racks were/are closed up.  M124 top back temperature was in the
range 89.8 - 90.5 degrees.  Visible fans are running.

------------------------------------------------------------------------------

DATE:  29:30-Sept-2011                         At:  Fermi

TOPICS:  TFW Readout, Parts from SciBooNE, End of Beam Running,
         Walk-through

Since the TFW Readout Crate's power supply was changed and its stalled
blower disconnected from the AC line power 9 days ago, it has been
running without any errors, specifically the "L3 output rate = 0"
problem has not come back.

Thursday afternoon I got the upper VME crate with its VME Bit-3 card and
its SCLD Substitute card from the T962 rack which is now stored at
SciBooNE.  Mitch was the 2nd person.  All of the ArgoNeut T962 material
is now stored at SciBooNE.

The racks were/are closed up.  M124 top back temperature was in the
range 88.9 degrees.  Visible fans are running.

------------------------------------------------------------------------------

DATE:  19:22-Sept-2011                         At:  Fermi

TOPICS:  Work at Fermi on the "No Output from L3" problem,
         Andy $78 $79 SCL Status Problem, Walk-through

Summary of Monday's work on the TFW readout problem:

Pulled the SBC 1/2" out of its backplane connectors and then re-inserted
it.  We still saw errors.

Used Philippe's reset one VRB at a time scripts.  Tried resetting VRBs
1 through 9 and then at the next failure 9 through 1.  In both cases, as
a given VRB is reset it then starts to send good data in its part of the
overall TFW data block.  But you need to reset all 9 of the VRBs to get
all parts of the TFW data block to have good data.  There is no one
problem VRB that when you reset it then everything starts to look OK.

Used Philippe's script to readout all registers from all 9 VRBs both
when everything is running well with good data from the TFW and when the
system is in the mode of sending bad TFW data.

Pulled out 2 or 3 VTM modules to look for water damage.  Recall that an
SCL Hub-End Status Concentrator card directly above the TFW readout VTMs
was clearly damaged by a water leak from the 2nd floor.  We saw no
obvious indication of water damage on any of the VTMs that we inspected
or on the back side of the TFW readout crate's backplane.  Because of
the top and bottom shields on the VTMs J0 and J3 connectors it would
take a more careful inspection than what was done today to confirm that
there was no water damage at all on any of the 9 TFW readout VTMs.

Replaced the VRBC that has been in use for the past 6 years with one
that Ted and Mark brought over.  This new replacement VRBC has the
latest firmware that was developed for SMT that includes more
conservative timing of the control bus from the VRBC to the VRBs.

Upon inspection of the VRBC that was removed from the TFW readout crate
Ted noticed that one of its J3 pins was bent over flat.  Ted is checking
on the function of that pin.

Philippe verified that the upper byte of the VRB's config word always
matched across all 9 VRBs for both good and bad readout events and that
he sees different values in this byte for both good and bad events.

Philippe circulated copies of more bad event dumps and of the 9 VRB
register dumps.

Summary of Tuesday's work to fix the TFW readout problem

1. M124 Rear Blower

   It was noticed that the rear blower in rack M124 was stalled and its
   motor was very hot.  I do not know how long it has been stalled.
   It was OK a couple of weeks ago.  It died without the loud bearing
   noise that the other failing blowers have made.  This was the only
   original blower left in the system.  We have 3 rebuilt units ready
   to swap in.

   At the end of Tuesday's store at about 17:50 the stalled rear M124
   blower was disconnected from the AC line power.  M124 is now running
   on just its front blower.  This should be plenty of air flow for
   M124.  These are big blowers: 800 cfm free air, 460 cfm at 2 inches
   of water pressure.  As planned, M102 has run for years using just
   one of its 2 blowers.  But this is a change to how the system has
   been running since 1999, and we know that the SBC likes lots of air,
   so we should think about replacing it - about a 4 hour job.

2. Is the end of the TFW data on each channel there ?

   Ted has been studying the bad events.  Bad events are short.  He
   believes that it may be the beginning of the expected data from the
   TFW that is missing in the readout from each VRB channel pair.  That
   is, perhaps what is being read out in bad events is the final 2/3 of
   what is normally there.  Philippe is going to study the bad events
   to see if he sees the same pattern.  If this is what is going on
   then I hope it is a big clue as to the cause of the bad events.

   Philippe has checked this and provided a comparison table.  It does
   not look like this is what is going on.

3. Replaced the TFW readout crate power supply.

   When Tuesday evening's store was lost we went ahead as planned and
   replaced the TFW readout crate power supply.  There was no rush so
   this work was done slowly.  38 minutes from pause run to resume run.

   The supply chassis that was pulled out has been running for 12 years
   but looks OK.  Its internal fans, that can only be seen when the
   supply chassis is pulled out and taken apart, look fine.  I installed
   a power supply chassis that we rebuilt a year ago to practice
   rebuilding all of them to get ready for 3 more years of running.

   Pulled out TFW power supply chassis SN# 10.  It consists of bricks
   +3.3V SN# 96390264 and +5V -2V -4.5V SN# 96390193.

   Installed TFW power supply chassis SN# 12.  It has all new fans both
   in its two bricks and in the chassis.
   This power supply chassis had a full rebuild to be ready for 3 more
   years of running.

Andy's SMT high rate test stand is missing its $79 SCL Status
connection.  It appears that at the SCL Hub-End end this cable has been
plugged into the $09 Status Concentrator input.  But that may have been
a setup used temporarily by some other group.  In any case we will leave
things as they are for one more week and then set things up correctly so
that Andy can finish his final tests.

The racks were/are closed up.  M124 top back temperature was in the
range 88.7 degrees but note that some of the SCL Hub-End status cables
may have been moved slightly since the last reading (and there is now
1KW less heat in the rack).  Visible fans are running.

------------------------------------------------------------------------------

DATE:  7:9-Sept-2011                           At:  Fermi

TOPICS:  Work at Fermi on the "No Output from L3" problem,
         L1 Cal Trig Links, Walk

Starting Friday Sept 2nd we started to see 3 different problems:

  1. 0x1f stops reading out and thus the rate of fully built events
     going into L3 goes to zero (and thus L3 output rate goes to zero).

  2. 0x1f continues to read out, the rate of fully built events going
     into L3 remains normal, but L3 output rate goes to zero.

  3. Blocks of crates, e.g. all of CFT and/or SMT go out of sync at the
     same time.

This was happening about 2 times per shift.  An SCL Init would always
get things running again.  By the time that I got here Bill had figured
out that all 3 of these problems were really 0x1f readout related.

Wednesday at about 18:00 Dean let me know that it had thrown a "type #2"
error, i.e. normal rate into L3, zero out of L3.

  1. I paused the TFW, got a formatted output of a recent event from the
     SBC, and got a raw dump of 0x2cf words from the SBC.  The event is
     very very screwed up in the formatted dump.

  2. I resumed the TFW and asked the Control Room to verify that the L3
     output rate was still zero.  It was zero.

  3. I paused the TFW, reset just the VRBC, resumed the TFW and asked
     the Control Room, "Is it fixed now ?"  They said No - still zero
     out of L3.

     Interesting note, before I clicked on Reset VRBC - I noticed that
     one of its Blue LEDs was stuck ON.  With no triggers coming in I
     don't think that it is normal for a VRBC to have either of its
     Blue LEDs ON.  I did not notice whether it was the L1 or L2 Blue
     LED that was stuck ON.  Clicking Reset VRBC made the Blue LED turn
     OFF.  Is it normal for either Blue LED to stick ON ??

  4. I paused the TFW, reset both the VRBC and all 9 VRBs, and resumed
     the TFW.  The Control Room said we now have normal L3 output.
     We did not issue any SCL Inits.

  5. Tentative Conclusion: The overall problem is with the TFW readout
     data and the cause is in the TFW readout crate.

Wednesday at 19:15 got a second instance of full rate into L3, zero rate
out.  No Blue VRBC LEDs were stuck on this time.  Resetting just the
VRBs did not fix the problem.  Resetting both the VRBC and the VRBs did
fix the problem.  No SCL Inits were used.

Various more tests: sometimes the Blue L2Acpt LED is on and sometimes it
is not on.  In all cases reloading both the VRBs and the VRBC gets the
system running again.

Thursday at about 12:14 the VRBC, the 9 VRBs, and the 9 VTMs have all
been pulled 1/2" out of their sockets and then latched back in and
pushed in.  The SBC was just pushed into its socket - no movement was
felt.

At about 12:39 and 12:40 I captured 2 Normal events with the TRICS
formatted and raw event capturer.

From the Log Book

    DATE: 3:5-APR-2006

    I forgot to log it at the time but I think this is the trip on
    which all the firmware in the TFW Readout Crate (the VIPA readout
    crate in M124) was brought up to date.  That is, the TFW readout
    was given current VRB, VRBC, and SCL Receiver firmware.  I should
    have logged this at the time.

    DATE: 6:9-JUNE-2006

    Friday:  Philippe's theory about why the TFW_TCC dump of TFW
    readout data does not look right is that there are two VME address
    lines shorted together in the VRB-VRBC-SBC VIPA readout crate.
    Power down the crate and probe with the Fluke.  There is a clear
    A8-A9 short.  Pull cards one at a time until the short goes away.
    This indicates that the problem is on the VRBC card.  It was a
    short between pins 36 and 37 of the VME Bus Adrs receiver chip U65
    on the VRBC card (address lines A08 and A09).  I removed this
    solder bridge and cleaned up some of the other clear soldering
    problems on this card.

Talked with Ted Zmuda.  He says that the SN# 86 VRBC that I have as a
spare from the sidewalk L1 Cal Trig test is an OK one to try running in
the TFW readout crate.  He says that swapping VRBCs as the next step in
debugging is the thing to do.

Over the past couple of weeks we have seen a few ADF to TAB link errors.
I forget the TAB coordinates - but the link starts from ADF Crate "D"
Slot #21.  Friday at about 10:30 Selcuk and I swapped the A and C
connectors at this location.  The cables were in fact swapped when we
found them so we just put them back in their official locations.  We did
not see any link errors in the first half hour or so.

The racks were/are closed up.  M124 top back temperature was in the
range 90.0 to 90.1.  All chillers are now usable and there are no
cooling water disruptions planned.  Visible fans are running.

1268 1760 1866 1987 2582 3264

------------------------------------------------------------------------------

DATE:  25,26-AUG-2011                          At:  Fermi

TOPICS:  DAB Clean Out, LBNE Meeting, Walk Through

LBNE Electronics Meeting.  20 points for Cold Bo DAQ System at LAPD.

Clean out of the DAB High Bay has started.  This includes emptying the
cabinets and cleaning out all the storage racks.  This is to make way
for MicroBooNE (and LAr1 ?) assembly in DAB.

The racks were/are closed up.
M124 top back temperature was in the range 89.2 to 89.7.  All chillers
are now usable and there are no cooling water disruptions planned.
Visible fans are running.

------------------------------------------------------------------------------

DATE:  10:12-AUG-2011                          At:  Fermi

TOPICS:  Master Clock Errors, PAB Meeting, Walk Through

Twice this week we had errors from the Master Clock - early Tuesday
morning Aug 9th at 00:32 and then Thursday morning Aug 11th at 05:52.
In both cases the Master Clock set two error flags:

    TRG_CPCC_M000/STIM   and   TRG_CSEQ_M001/SHDER

The "TRG_CPCC_M000/STIM" error flag means that the Phase Coherent Clock
module (the module that receives the raw TeV RF and TeV Sync signals)
was not happy with the timing of the TeV Sync signal.  Note: the PCC did
not set its Sync Missing error flag.

The "TRG_CSEQ_M001/SHDER" error flag means that the Sequencer Module
(the module that takes the cleaned up TeV RF and TeV Sync and makes the
"Time Line" signal patterns that repeat every turn) did not see TeV Sync
where it expected to see it.

Additional information:

  - Neither Tuesday morning nor Thursday morning do the CDF or Main
    Control Room log books indicate any TeV clock problems.

  - The same DAQ Shifter was doing the MCH-1 walk through both times
    that this happened - but he did not open M100 or poke at it.

I emailed Steve Chappa and called him.  I proposed that between stores
I wiggle and push in the LEMO connectors on the PCC, clear any resulting
errors, and then try jumping up and down in front of M100.  He preferred
that we do nothing at this time and wait and see if the problem comes
back.

The Thursday morning Master Clock error caused the SCLD in the L1 Cal
Trig to get confused and its SCL Receiver did not lock back on.  Selcuk
and Joe came in at about 7AM.  Once they saw the system, they knew what
the problem was and pushed the button.  Over the phone talking with the
Control Room they did not know what the problem was.
I don't think that the Control Room made it clear that there had been a
Master Clock problem.

PAB meeting about Cold DAQ in LAPD.  Stephen wants to move forward with
this and have it running by the end of the year.  The full drift time
will be 1.5 msec.  This setup will use the existing gray PFC cables and
existing feedthrough to WRP-16 card cables.  The internal readout cables
will need to be replaced with longer ones.  They will pass me the
length.  This will use a 4x gain with Cap/7 cards.  Will need firmware
for 2048 samples in 1.5 msec, i.e. a sample every 0.732 usec.  The cold
electronics cards to feedthrough cables will be longer than in Bo.
The intent is to run the 4x gain with Cap/7 cards and the new firmware
in Bo at PAB and then take it to LAPD.

The racks were/are closed up.  M124 top back temperature was in the
range 89.0 to 89.4.  All chillers are now usable and there are no
cooling water disruptions planned.  Visible fans are running.

1956

------------------------------------------------------------------------------

DATE:  27:29-JULY-2011                         At:  Fermi

TOPICS:  Flat Rolling Rate Plots, Power Outages, Configuration Feature,
         Walk Through

Wednesday morning before I left MSU the TrgMon feed from MSU to the
Rolling Rate Plots got stuck during a store.  I do not think that there
was an email from the MSU machine about this.  The part of the system
that FTPs the TrgMon information to the web server was fine.  The part
that was stuck was the TrgMon pulling data from Fermi.  If I would hit a
carriage return in that window then it would pull a fresh current set of
data from Fermi and within one minute that new data would be on the web.
But without poking the pull from Fermi part - it just sat there.  I just
killed and restarted that part of the system with no trouble.

When I first looked at this machine it was displaying a notice to renew
some security software.  That display had grabbed the mouse focus and I
had to click something like "remind me later" to get out of it.
I believe that sometime on Thursday Philippe found that a process that
sends email if this system is in trouble had itself failed and needed
to be restarted. All has been fine now for a couple of days.

Late Wednesday night a power supply (a UPS I think) in a Tevatron
safety system died and it caused the Tev to dump the store and it
caused lots of things to shut down, e.g. the toroid magnet at D-Zero
shutdown because it lost its permit because of this safety system
failure. I think that the loss of this permit actually caused all of
the accelerator chain to dump its beams.

Then very early Thursday morning there was a rain and lightning storm
and a site wide power outage for about 2 hours. Things came back up
running from the Kautz Rd sub-station. The feed to the Main sub-station
was down until Thursday at about noon. It had 3 faults. The switch back
to the Main sub-station was at 7 AM on Friday.

D-Zero mostly came back OK but it is still very humid. There were major
problems in the accelerator complex, e.g. some of the Tevatron is almost
at RT and all of it is up to LN or above. D-Zero lost about half of its
helium for the VLPCs and solenoid magnet. All of the helium that was in
the Tevatron was lost. Currently beam is not expected until this coming
Thursday. The detector will open Monday and Tuesday to replace failed
preamp power supplies.

The power down and up at 7 Friday morning appears to have gone OK for
our stuff. Configuration was: 1531/0 TFW and 51/0 RM. In M122 Top there
was some blinking of the slot 21 card as the card in about slot 16, 17,
or 18 was configuring - but the configuration looks OK. I could not
tell which slot was "cross talking" into slot 21 and I did not run it
a second time to figure it out. Check this with the results last
fall/winter.

The racks were/are closed up.
M124 top back temperature was in the range 99 to 101. After the first
power outage the MCH-1 Liebert started up with forgotten settings as
normal.
After resetting its set points M124 back was running 88.0 to 88.7
The chillers are running 100% with none of the big Tranes working.
One of the Trane compressors will be replaced next week. Until then
it is 100% York.  Visible fans are running.

1993

------------------------------------------------------------------------------

DATE: 14,15-JULY-2011        At: Fermi

TOPICS: Work on the spare TFW TCC, 4" Fans, SUN Monitor, LBNE Meeting,
        Walk Through

The spare TFW TCC, box #4, has been running with its new power supply,
new fan, new battery, removed floppy, CD-ROM, and audio card, and
modified air flow for the past 4 weeks. It was running just fine when
I got here. It had not booted itself. I ran shutdown and put it in the
brown spares cabinet.

LBNE meeting but no work at PAB.

The SUN workstation 20 monitor is dying. Its green LED goes off and
then starts to flash which would normally indicate that the cpu box is
not running. The cpu box appears to be fine - it's something in the
monitor that is about ready to pack it in. It would die after 10 minutes
or so of running by collapsing both vertically and horizontally so
I assume that it is a power supply problem. I removed the plastic cover
to get more air flow through it and that way it runs just fine so far.

There are no spare 4" fans for the power supplies or for the M124 rear
transition card cooling in the cabinet at Fermi. I know that I had
2 spare fans there in a large plastic anti-static sack. I need to bring
some more to Fermi.

The racks were/are closed up.
M124 top back temperature in the range 89.8 - 91.2
I think that the chillers are now supposed to be stable.
Visible fans are running.

1263 2950

------------------------------------------------------------------------------

DATE: 15:17-JUNE-2011        At: Fermi

TOPICS: Work on the spare TFW TCC, Walk Through, M123 VME Errors,
        PAB Work

TCC spare Box #4 ran OK over the past 2 weeks with its new power supply,
new fan, and removed CD-ROM, Floppy, and sound card.
I shut it down, replaced its battery, and started it back up and
experienced the expected loop through "setup" to set its clock before
it would boot. I then ran shutdown again, let it sit 10 minutes
unplugged, and started it back up to verify that everything behaved as
expected. It did. TCC spare box #4 will continue running on the bench.

I brought down and will leave here the new power supply, fan, and
battery for TCC Box #3 - the online running TFW TCC.

Summary of the un-expected power down problems and actions to date.

  - On 11-APR-2011 at about 6:45 AM TCC Box #4 was running as the
    TFW TCC and powered down with no explanation.

  - On 13-APR-2011 I pull TCC Box #4 out of the online running system
    and install spare TCC Box #3 as the running TFW TCC.

  - On 3-MAY-2011 at about 1:45 AM TCC Box #3 was running as the
    TFW TCC and powered down with no explanation.

  - On 4-MAY-2011 with TCC Box #3 still running, we ask George to:
      Wiggle the power cord to the current L1TCC at both ends.
      Wiggle the power strip.
      Wiggle other cables on the back of L1TCC.
      Tap and shake L1TCC.
      No indication of distress from L1TCC in response to any of
      these actions.
      Shutdown L1TCC from the console.
      Unplug the cable for the spare L2TCC from the power strip.
      Extract the power cord and replace with a hefty new power cord
      provided by Mike and Mike.
      Plug the new hefty cord into the outlet previously occupied by
      the spare L2TCC.

  - On 11-MAY-2011 I shutdown the online TCC Box #3 to move its power
    plug to a completely different outlet / source of AC power.

Wednesday afternoon at PAB we had the safety review walk through for
the operating permit for the Cold Bo DAQ system. Without being told
this had morphed into a safety review of ALL the Bo electronics (duh).
There were problems with some equipment that is outside of our actual
responsibility. By Thursday noon we had things cleaned up for
a re-inspection and the review committee has now recommended that an
operating permit be issued for Bo Cold DAQ.
I think this has actually been issued now.

M123 VME Errors

All that I got done was to verify that:

  TDM cards use:  4013XL in their VME IF FPGA and
                  4028XL in their Main Array and Board Support FPGAs.

  TRM cards use:  4013L in their VME IF FPGA and
                  4028XL in their Main Array and Board Support FPGAs.

This fits the documentation:

  http://www.pa.msu.edu/hep/d0/ftp/l1/framework/hardware/general/characterization_of_run_ii_components.txt

the part near the end from 22-OCT-1998.

So given some PROMs we are ready to deal with the TRM cards. I have
a few - maybe 5 or so. I should edit one line in the script and build
the firmware for the 4013XL while the machine is still all setup and
runnable in the back office. I should order some PROMs that will kind
of work with one jumper wire and a new cooker.

The racks were/are closed up.
M124 top back temperature in the range 90.0 - 91.6
I think that the chillers are now supposed to be stable.
Visible fans are running.

------------------------------------------------------------------------------

DATE: 13-JUNE-2011        At: MSU

TOPICS: Concerns about VME IO interface FPGAs in M123-Middle

Error messages have been found recently in the Trics logfiles recorded
during initialization:

> I$ Initializing TDM for L1FW SpTrg #000:015
> I$ @Master#1/Slave#1/Slot#11
> W$Warning: CFpgaVmeInt::Initialize Found VME BusError Flag Set

This means that the "VME error flag" that the VME interface FPGA
maintains was found set before the card was initialized (which clears
this flag again).

Only this one TDM card has shown this problem in the recent past, *BUT*
this oddity was one of the occasional symptoms detected in the M122-Top
Crate where some of the VME interface FPGAs would lock up and pollute
all VME IOs to all cards in the crate.
This has happened on

    15-Feb-2011 17:10
    07-May-2011 13:52
    28-May-2011 22:07
    02-Jun-2011 11:00

Reminder of what cards are in this crate:

> MIDDLE CRATE - 9U Custom Crate for L1 FW Spec Trig Fired and Geo Sect L1 Accept
> -----------------------------------------------------------------------------
>
> SLOT#
>
>  1. TOM and Vertical Interconnect Slave for Master #1 / Slave #1
>
>  2. open
>
>  3. Pass-Through for Expos Group, Front-End Busy, Global Disable
>
>  4. TDM L1 Specific Trigger Fired #127:112
>  5. TDM L1 Specific Trigger Fired #111:96
>  6. TDM L1 Specific Trigger Fired #95:80
>  7. TDM L1 Specific Trigger Fired #79:64
>  8. TDM L1 Specific Trigger Fired #63:48
>  9. TDM L1 Specific Trigger Fired #47:32
> 10. TDM L1 Specific Trigger Fired #31:16
> 11. TDM L1 Specific Trigger Fired #15:0
>
> 12. Pass-Through for Specific Trigger Fired
>
> 13. open
>
> 14. FOM L1 Accept for Geographic Section #127:64
> 15. FOM L1 Accept for Geographic Section #63:0
> 16. FOM++ Skip Next, L1 Qualifiers, L3 Transfer Number
>
> 17. TRM FIFO to L2 FW for L1 Specific Trigger Fired #127:64
> 18. TRM FIFO to L2 FW for L1 Specific Trigger Fired #63:0
>
> 19. Pass-Through for L2FW L1 Spec Trigger Fired Mask
> 20. Pass-Through for FOM++
> 21. Pass-Through for L1 Accepts and RAY
>

If VME IOs start hanging in the M123-Middle crate, the obvious and
immediate symptom that shifters would notice would be corrupted
specific trigger rates. We expect that a VME SYSRESET in that crate
would restore normal operation.

Some additional confusion would probably occur e.g. if the DAQEXP/COOR
tried disabling specific triggers while the VME bus was polluted by
data from an old VME IO hanging on the bus. In this case, an
initialization might be required to restore sanity.
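
The TDM slot assignment in the crate list above is regular: each TDM
card carries 16 Specific Triggers, with #15:0 in slot 11 and #127:112
in slot 4. As a convenience when reading the Trics logfile messages,
here is a minimal Python sketch of that mapping. The function name is
invented for illustration and is not part of Trics.

```python
def tdm_slot_for_sptrg(sptrg):
    """Return (slot, high, low) for the TDM card in M123-Middle that
    carries the given L1 Specific Trigger number, following the slot
    list above (slot 11 = #15:0 ... slot 4 = #127:112)."""
    if not 0 <= sptrg <= 127:
        raise ValueError("L1 Specific Trigger number must be 0..127")
    group = sptrg // 16        # which block of 16 Specific Triggers
    slot = 11 - group          # blocks count down from slot 11
    low = 16 * group
    return slot, low + 15, low


if __name__ == "__main__":
    for n in (0, 40, 127):
        slot, high, low = tdm_slot_for_sptrg(n)
        print(f"SpTrg {n}: slot {slot}, TDM #{high}:{low}")
```

Specific Trigger 0 lands on the slot 11 card, which is the card named
in the logfile warning quoted in this entry.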

Reminder of control values in short IO space to reset Crates:

> ////////////////////////////////////////////////////////////////////
> // Access to "Master's IO space" to reset remote crate
> ////////////////////////////////////////////////////////////////////
> //
> //  -------+-------|-------+-------|-------+-------|-------+-------
> //  A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A
> //  3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0
> //  1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
> //  -------+-------|-------+-------|-------+-------|-------+-------
> //                                         |Rck|             |Crt|
> // |1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1|0 0 0 1|Num|0 0 0 0 0 1 X|Num|0
> //  \-----------------------------/         \-/             | \-/
> //    Upper Half must be 0xffff              |              |  |
> //    even though we do a short IO           |              |  |
> //                                           |              |  |
> // Rack Number i.e. V.I. Master Address <----/              |  |
> //   0 = Rack M122 (Per Bunch Scalers + L2FW)               |  |
> //   1 = Rack M123 (L1FW + L2 Monit Scalers)                |  |
> //   2 = Rack M124 (SCL + L1FW Readout)                     |  |
> //   3 = Rack M101 (L1CT Readout)                           |  |
> //                                                          |  |
> //                                      0 = Interrupt <-----/  |
> //                                      1 = Reset              |
> //                                                             |
> //                  Crate Number i.e. V.I. Slave Address <-----/
> //                    0 = Top Crate
> //                    1 = Middle Crate
> //                    2 = Bottom Crate
> //                    3 = reserved
> //
> // e.g. Reset L1FW M122 Top Crate (Master 0, Slave 0) = VME write to 0xffff1018
> // e.g. Reset L1FW M122 Mid Crate (Master 0, Slave 1) = VME write to 0xffff101A

To Reset the M123-Middle crate we can use Trics' VME Access dialog.
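
As a cross-check of the bit layout above, here is a minimal Python
sketch that assembles the short IO address from the rack (V.I. Master)
and crate (V.I. Slave) numbers. The function and constant names are
made up for illustration and are not part of Trics; the sketch only
computes the address and performs no VME access.

```python
# Hypothetical helper, not part of Trics: builds the short IO address
# from the bit layout in the comment block above.

UPPER_HALF = 0xFFFF0000   # upper half must be 0xffff
FIXED_BITS = 0x1010       # A12 = 1 and A4 = 1 are fixed in the layout
INTERRUPT = 0             # value of bit A3 for an interrupt
RESET = 1                 # value of bit A3 for a reset


def remote_crate_address(rack, crate, action=RESET):
    """rack:  0 = M122, 1 = M123, 2 = M124, 3 = M101  (bits A11:A10)
    crate: 0 = Top, 1 = Middle, 2 = Bottom, 3 = reserved (bits A2:A1)"""
    if rack not in range(4) or crate not in range(4):
        raise ValueError("rack and crate are 2-bit fields (0..3)")
    if action not in (INTERRUPT, RESET):
        raise ValueError("action must be INTERRUPT or RESET")
    return UPPER_HALF | FIXED_BITS | (rack << 10) | (action << 3) | (crate << 1)


if __name__ == "__main__":
    print(hex(remote_crate_address(0, 0)))   # M122 Top reset
    print(hex(remote_crate_address(0, 1)))   # M122 Middle reset
    print(hex(remote_crate_address(1, 1)))   # M123 Middle reset
```

The first two calls reproduce the two "e.g." addresses in the comment
block (0xffff1018 and 0xffff101a); the third gives 0xffff141a for the
M123 Middle crate.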
Here is a reminder of what the VME Access dialog window looks like

> http://www.pa.msu.edu/hep/d0/ftp/tcc/trics_ii/trics_dialog_03_io_raw_vme.gif

    VME Addr          0xffff141A   **** to reset M123-Mid ****
    Data Out          0            (or any value)
    Repeat Cnt        1            (this is the default setting)
    Bit3 Adapt        0            (default)
    A16                            **** Short IO space ****
    D16                            (default)
    Supervisory                    (default)
    Data                           (default)
    Bit3 API                       (default)
    Read After Write               (default)
    Invert Data Out                (default)

    and click "Write"

I don't remember if the "Supervisory" and "Data" mode were both required
for Short IO space (I don't think "Data" is relevant), but these
settings will match how Trics is talking to the Bit3 software to reset
the M122 crates.

------------------------------------------------------------------------------

DATE: 1:3-JUNE-2011        At: Fermi

TOPICS: Work on the spare TFW TCC, Walk Through, PAB Work

The new tested fan and power supply are now in spare TFW TCC which is
Box #4. The battery that I carefully wrapped in a plastic sack and
brought down here with me is the "dead" battery that I pulled out of
the machine by Philippe's desk when I practiced changing things. Duh.
So I will need to swap batteries next trip.

As planned I unplugged the floppy and CD ROM drives. They are unplugged
both in the sense of no power cable and also no signal cable to the
mother board. Making the appropriate BIOS settings changes was
straightforward. I have also removed the "sound card" from this machine
(one less thing on the PCI bus).

I booted this machine twice to verify that all was normal during the
start up. This machine is now cooking for 2 weeks on the bench here
with a sign on it saying not to play with it. The removed power supply,
fan, and sound card are being brought back to MSU.

The new fan runs at full speed all the time and it pulls a lot more air
through the box. I wanted to verify that the power supply fan could
still find enough "input air" so that it could properly cool the power
supply.
It's hard to see where very much air can come into this style of Dell
box. Using the kleenex test it was clear that a lot more air came out
of the power supply fan when the box was opened about 1" at the back
than when it was completely closed. When the box was completely closed
the result of the kleenex test was consistent with no air moving
through the supply. Thus to provide more input air I left the back
panel "slot covers" off the 3 unused PCI slots. With those covers off
and the box fully closed the kleenex test indicates that there is at
least some air moving through the power supply.

I dug into how these power supplies work just a little bit. This is the
kind of supply with the 20 pin and a separate 4 pin power connector for
the mother board.

  - The 4 pin connector is a separate +12V supply to run the "buck"
    step down converter for the cpu core supply.

  - The only thing that is ON when the supply is OFF is 2 Amps of +5V.

  - The only "control" lines in the 20 pin connector are:

      "Power ON"  Pull this input to the power supply to Gnd to turn
                  ON the supply.

      "Power OK"  This logic signal output from the supply indicates
                  that all the supply's power outputs are running OK.
                  I do not yet know the sign or signal levels of this
                  logic signal.

PAB, Bo was re-filled, signals were good, more good runs for analysis
were taken.

The racks were/are closed up.
M124 top back temperature in the range 90.1 - 90.6
There was/is more work scheduled for the chillers this coming week.
Visible fans are running.

The Workshop is June 13:17.

1075 1268 1791 1824 3082

------------------------------------------------------------------------------

DATE: 11:13-MAY-2011        At: Fermi

TOPICS: Change the L1TCC Power Outlet and boot it, Work at PAB,
        Walk Through

Wednesday at about 16:45 I shutdown the TFW TCC and changed where it is
plugged in. It now gets power from the outlet strip on top of M101.
This outlet strip gets power from the contactor box for rack M101 so
the TFW TCC is now on a completely different AC power feed. TFW TCC
started back up with no problem. As expected, after being plugged in
I needed to push its front panel power button.

I put labels on the back of L1TCC and L2TCC just in case we have to
talk with someone again at 3AM who is not experienced with these
machines. I put a label on the front of L1TCC that points to its front
panel Green Power LED and Power Switch.

If L1TCC shuts off again, it may be worthwhile to ask the person who is
looking at it to look at the back of the machine, right above where the
mouse and keyboard cables plug in, and see if any of those LEDs are ON.
There are ?? LEDs in a row and I think that their pattern or color may
tell you something about the basic state of the machine.

Worked at PAB. First muon tracks with new DAQ. Ready for safety review.

The racks were/are closed up.
M124 top back temperature in the range 90.1 - 92.7
There was/is more work being done on the Trane cooling water chillers.
Visible fans are running.

------------------------------------------------------------------------------

DATE: 3-MAY-2011        At: MSU Action at Fermi

TOPICS: TFW TCC Turn Off

The L1TCC machine shutdown during a store early Tuesday morning May 3rd.
We do not know if this shutdown was caused by a problem with the machine
or by a temporary loss of AC line power. This shutdown cost at least one
hour of good beam time.

  - From the Rolling Rate Plot it looks like L1TCC died at about 1:45 AM
    I *think* that it died shortly after a new run was started.

  - The D-Zero Control Room called me at about 2:00 AM

  - The DAQ Shifter and/or Captain verified that the LEDs were off on
    the L1TCC. I asked them to push its front panel power button.
    It was back on by about 2:15 or 2:20

  - Bill came in and got COOR and the back-end systems running again.
    A Physics Run was going by about 2:45

Note this is the second time that this has happened.
See the log book for 11-April-2011 i.e. about 3 weeks ago.

Note that this is a different machine. I swapped in the spare on
13-Apr-11.

Note that again the L2TCC, running from the same power outlet strip,
did not have any problems.

Note that again Philippe found nothing in either the system log files,
the application program log files, or the BIOS system event log to
indicate why this machine shutdown.

------------------------------------------------------------------------------

DATE: 27:29-APR-2011        At: Fermi

TOPICS: Work at PAB, System Checks and MCH-1 Walk Through,
        Record Stores, T962 Bias Voltage Filter Box

The fan at the bottom of M124 that blows air into the small TFW
Communications VME crate that holds the TFW Bit-3 and VI cards was not
operating. I waited until between stores and then pulled things apart
enough to remove a shredded plastic bag from this fan. It was operating
OK two weeks ago. It is quite dirty and needs to be cleaned if we need
it for more than a few more months.

Two Fermilab record initial luminosity stores this week.

At PAB Bo is pumped down to 5 e-6 Torr and cathode tests have been
completed.

Thursday morning I got the T962 Wire Plane Bias Voltage Filter Box from
Wide Band with help from John Voirin.

The racks were/are closed up.
M124 top back temperature in the range 90.3 - 92.0
There was/is more work being done on the Trane cooling water chillers.
Visible fans are running.

1075 1824 1987 1993 3291

------------------------------------------------------------------------------

DATE: 13:15-APR-2011        At: Fermi

TOPICS: Swap TFW TCC Box #4 for Spare #3,
        New Firmware PROMs for VME FPGA in M122 Top,
        Why Scaler Module SM-64 and SM-65 will not Configure their
        VME FPGA, PAB Work

Because of the failure of the TFW TCC on Monday of this week it is
being replaced by its spare.
Before installing the spare, Box #3, I checked the following:

  - Check that Configure_FPGAs.dcf does not try to configure
    L1Cal crate(s).

  - Check that Boot_Auxi.mcf includes a line with
    TrgMgr_FixL2Fifo_Enabled

  - Edited the Init_Post_Auxi_L1FW.rio so that it includes the change
    from 23-Sept-10 that moved Tick_Selector #3 to tick 78.

  - Set the starting Luminosity Block Number to $ 0079 8500

  - Checked that the computer time and date were OK.

Philippe had already pulled the last TRICS log file off of the machine
that was being removed from service.

TFW TCC swap, pull out of service Box #4, put into service Box #3.
This went very fast - down time of 5 minutes or so.

Tried to figure out why Box #4 failed, powered itself down, this past
Monday morning. It ran OK on the bench here Wednesday afternoon, and
all day Thursday and Friday. I opened it up while it was running, and
pushed on and wiggled all the cable connectors. It just kept on
running. It looks clean inside. Give up trying to find a visible cause
of the problem.

Box #4 will remain at Fermi and currently is our official Spare
TFW TCC. It is labeled as such and is in the brown cabinet. It has
a Bit-3 card in it and is ready to run.

I also fired up the original 13 year old HP TFW TCC. It started up OK
and ran for 1/2 hour. It does not have a Bit-3 card in it.

Later Wednesday afternoon I was given the OK to replace the VME FPGA
PROMs in the M122 Top crate, the Luminosity Foreign per Bunch Scalers.
The serial numbers of the scalers that were in this crate at the start
of this work, starting from Slot #2, were:

    65 42 38 36 37 34 33 30 32 31 2 3 4 40 35 39 7 6 8 5.

Recall that scaler card SN #5 in Slot #21 already had its PROM changed
some months ago.

Swapping PROMs went OK right up until the end. I swapped about 6 or so
at a time and then powered up the crate to verify that all VME FPGAs
were configuring OK. If one of the VME FPGAs does not configure then
the DTACK* LED on the TOM card in Slot #1 stays ON.
After doing the last set of cards, the DTACK* LED was ON. The card
whose VME FPGA would not configure was SN #65 in Slot #2. I tried
a different new 1811 VME FPGA prom and card SN #65 still would not wake
up its VME FPGA. I tried one of the old proms in this card and still
the same problem - so I had to give up for now on this card.

From Slot #2 in M122 Top I pulled Scaler Module SN #65 and installed
SN #50. I did put a new 1811 VME FPGA prom into scaler module SN #50.
Now from the LEDs on the TOM in Slot #1 it looks like all 20 scaler
modules have their VME FPGAs waking up OK.

I twice did a Configure of M122 Top. Both times 340 configures and
0 errors. I did not see any extra flashing of VME activity LEDs during
either of these M122 Top configures. I then returned the system to the
DAQ Shifter.

Scaler Module #50 that was installed was maybe not the best of choices
of the SM cards that are spares here (see its trailer sheet) but
I believe that it has run OK in Physics Beam Runs before from about
26-July-2006 until about 10-Aug-2006.

I have also setup Scaler Modules SN #54 and SN #55 with the new 1811
VME FPGA firmware so that they may be used as spares for M122 Mid and
Top.

OK, the real problem with Scaler Module SN #65 (and SN #64 from the
swap on 6,7-JAN-2011 of the VME FPGA firmware PROMs in M122 Middle
crate) is that they have 4013XL FPGAs in them and they are not bit
stream compatible. See their Trailer Sheets.

At PAB got Bo final assembly finished and closed up. There was an HV
feedthrough leak so they had to pull it apart Friday, fix it, and put
it back together. It is pumping down now and has passed the gross leak
tests. The PAB phone number is extension 3330.

The racks were/are closed up.
M124 top back temperature in the range 92.3 - 92.9
Visible fans are running.

3082

------------------------------------------------------------------------------

DATE: 11-APR-2011        At: MSU action at Fermi

TOPICS: TFW TCC Power Down

I was paged at about 7:45 EL time because of "data logger errors" and
because the COOR to L1_TCC connection was shown as bad in some
monitoring display.

I asked the DAQ Shifter (Rafael) to look at the console on the front of
M101. With the console selector set to TFW there was no display. Rafael
said that the green power LED on the front of the L1 TCC was off. He
pushed its power button once and it came back on. It booted OK and
auto-started TRICS.

I vnc'd to the machine and did a full Initialize and that looked clean.
I started TrgMon and things looked normal. They still could not get
COOR to connect with L1_TCC. They re-started COOR and then things were
OK and the Beam Physics running resumed. Yes, this was all near the
beginning of a nice store.

The problem appears to have started at about 6:20 Chicago time. We see
that the monitoring links all dropped at that time. Luminosity and such
flat lined in the Rolling Rate display.

Philippe checked the log files from TRICS and the operating system.
So far everything is consistent with the machine just having an
uncontrolled loss of power.

------------------------------------------------------------------------------

DATE: 30,31-MAR & 1-APR 2011        At: Fermi

TOPICS: -2,23 HD, PAB Bo Work, Walk Through

At PAB the wire frame to preamp cables are not ready yet so that has
been delayed. The cable water tests continue but no definitive result
yet. They want to move the WRP card file out by 1 to 2 inches. Thus
I need to bring ground copper for it. Final assembly has been set off
for a week.

Wednesday evening -2,23 HD had trouble. We started to see a lot of
noise from it and the scope shows continuous uni-polar "bumps" of about
100 mV. The bumps are about 1 usec wide and happen about every 5 usec.
They are relatively smooth and do not look like sparks.
The Cal precision readout cell 2/10/5/1/9 (-3,46) has occupancy over
85% during ZB running.

I excluded -2,23 HD from the L1 Cal Trig at about 9:40 PM Wednesday
evening and Rafael killed 2/10/5/1/9 in the precision readout. It had
continuous noise and we did not want to have trouble with the first
store after being down for a week. There are more notes about this in
the elog starting at about 7:30 PM Wednesday evening.

Before editing the file Excluded_Trigger_Towers.msg I made a copy in
the same directory called Excluded_Trigger_Towers.msg_saved_30mar11

In the Pedestal run at about 1AM Thursday morning 2/10/5/1/9 still
looked bad. The scope watching -2,23 HD looked OK all day long on
Thursday. Between stores late Thursday afternoon the pedestal run
looked OK and the scope still looked OK. At 17:45 Thursday
I un-Excluded -2,23 HD.

The racks were/are closed up.
M124 top back temperature in the range 92.0 - 92.5
Visible fans are running.

1030 1830 2582

------------------------------------------------------------------------------

DATE: 16:18-MAR-2011        At: Fermi

TOPICS: Post Shutdown Operation, PAB Bo Work, Walk Through

Since the one week March shutdown the L1 Cal Trig has been running with
all TTs back in. Currently +4,11 HAD has some baseline noise but not
enough to exclude it.

There was an interesting GAB failed status early Friday morning:

  Friday, March 18, 2011 2:57:32 CDT: CALMUO/Log: 982951: Joseph Haley
  Major alarm L1CAL_TABGAB_M107/EVT_FAILED_GAB_STATUS -> Clear T.
  It seems to be an isolated incident and didn't cause any problems
  to data flow.

This "normal" injection alarm happened during stable beam Physics
running.

All PAB Bo electronics is now on the mounting plates. Cables pulled
back out for new readout flange seals and grinding smooth the 90 degree
bend. Waiting for wire plane cables for final installation. Ran a test
signal through a full 16 channel "slice".

The racks were/are closed up.
M124 top back temperature in the range 90.1 - 90.5
Visible fans are running.

------------------------------------------------------------------------------

DATE: 9:11-MAR-2011        At: Fermi

TOPICS: SCL Hub-End Work to fix the SMT 0x62 0x67 L2_Busy, PAB,
        No new Firmware, L1 Cal Trig Recovery, Walk Through

Working with Andreas Jung we checked the SCL Hub-End to verify that it
was correctly receiving the L2-Busy on Geographic Sections 0x60 through
0x67.

First we just used the test box to send known state L2-Busy signals
into the SCL Hub-End and we watched the TrgMon display to see whether
or not the TFW was seeing these signals. There were clearly problems in
GS $62 : $67.

Next we unplugged the L2-Busy output cable from this Status
Concentrator card and watched its ECL output with the Fluke meter as we
put known L2-Busy signals into the various G.S. connectors with the
test box. The result is:

  Geo Sect    What we see
  --------    -------------------------------------------------------
  $60   96    Valid ECL levels   can flip output with input test box
  $61   97    Valid ECL levels   can flip output with input test box
  $62   98    Not Valid ECL      flips between -3 V and +25 V
  $63   99    Not Valid ECL      no flip   always ?? mV
  $64  100    Valid ECL level    no flip   always -956 mV
  $65  101    Valid ECL level    no flip   always -966 mV
  $66  102    Not Valid ECL      no flip   always -31 mV
  $67  103    Not Valid ECL      no flip   always -21 mV

Note that the $60-$67 Status Concentrator card is the one that was so
wet during the flood of Saturday 25-APR-2009 that Geoff had to pull out
Status Cable connectors, drain the water out and dry things out, before
we could make all the Geographic Sections run correctly again. See the
log book for 25:26-APR-2009.

Thursday about noon we shutdown the SCL Hub-End to un-cable and replace
the $60-$67 Status Concentrator card. Ted was here and did a lot of the
work. Started work at 10:52. Un-cabled and had the 4 left most
concentrator cards out by 11:20. Finish inspecting for water damage and
start back together at 11:24.
Power on at 12:07 so serial output is running but still need to plug in
the Status cables. Finish plugging in the Status cables at 12:20.

The Status cables that were un-plugged for this work were:

    from $78-$7f    78, 7d
    from $78-$7f    70:76
    from $78-$7f    68, 69, 6a, 6b, 6f
    from $78-$7f    60:67

Also had to un-plug: $13, $16, $17.
Also had to move the thermometer probe.

Note that the card next to the wet $60:$67 card has no links used at
all, i.e. there are no links used in the $58:$5f range.

Only the $60:$67 card showed any damage or any sign of being wet.
It was the only card that was changed.

The visible damage on the $60:$67 card was near the top by the ECL
drivers for example the L1 and L2 Busy signals to TFW drivers. There
was a lot of damage - not just a little white powder. There was lots of
brown, yellow, red junk thick enough that it could have clearly cut
through traces.

Ted took the Status Concentrator card that we pulled out of $60:$67
back to his lab for show and tell. He is not going to try to fix it.

We have the two known spare Status Concentrator cards in our cabinet
here at D-Zero. Ted took both of them to his lab Wednesday night to
verify that they powered up and had the correct firmware.

I did not install the new VME I/F firmware in Luminosity M122 Top crate
on this trip. Things were too busy with other shutdown work. RCs know
this and it will be scheduled for another trip.

At PAB the preamp mounting plates are now completely setup with all
hardware. Prototype cards are on 3 plates for a test fit into the
cryostat. All ground cables are made and fitted with hardware. The
Internal and External readout cables are installed with a clamp that
Kelly made at the bottom of the Shield Pipe. The WRP Card File is on
the readout feedthrough flange and all WRP cards are installed.

We should see 2 glitches in the L1 Cal Trig links. One long one on
Thursday at about 11 to noon when the SCL Hub-End work was done.
Then on Friday at about 11 when the Master Clock was put back to Normal
to lock onto the Tevatron.

On Thursday I had to push the button on the SCLD card (at the very top)
to get the L1 Cal Trig's SCLD SCL Receiver to lock back on. I also had
to do a L1 Cal Trig Init after the Thursday outage to get things
running again. For the short bump on Friday it just made some link
errors and then started running OK after that by itself.

When I arrived the M123 back door was not latched closed.
When I arrived the M124 Top was 95.2
Since then the thermometer probe has been moved and I now see
temperatures in the range 90.4 - 90.5
Visible fans are running.

------------------------------------------------------------------------------

DATE: 23:25-FEB-2011        At: Fermi

TOPICS: March Shutdown Schedule, Equipment and Action at PAB,
        Bit-3 Gift, Walk Through

The March shutdown schedule has changed. They now need to dump the
store at 6AM on Monday March 7th. Beam in machine by noon Friday with
the first shot anticipated Friday evening March 11. Our TFW work is on
the schedule but not yet assigned a time slot - but it will be after
8AM Tuesday and before Thursday evening. On the schedule I also see
"test spare L1Cal TCC".

Delivered to PAB: the WRP-16 card file now with a mounting bracket for
the readout pipe flange, the 2 final WRP-16 cards, DAQ-488 software for
144 cold channel test. The internal cards have been vapor phase
cleaned. Meeting to finalize card location on the Field Cage in
1,2,2,1,2,1 pattern. Made a full chain warm slice test using the
prototype cards. Cut internal and external readout cables to final
length and installed the 50 pin ends. Still the bias voltage cables to
do. HP Gen and tester cables back to MSU. Kelly and Cary to finish:
mounting plates, mounting clamps, move ground ring up, provide ss
mounting screws and rod by next trip. Walter is doing the wire frame to
preamp cables.
ADFs SN# D33 (Maestro), D34, and D35 were delivered from MSU and
installed in the PAB Bo Wiener crate.

Selcuk found and gave to me a Bit-3 card.  It is the PCI end of some kind
of copper link card pair.  Its copper cable connector has 3 columns of
20 or 21 pins each.  It has "SBS Technologies" written on it in
silk-screen which sets the era that it is from.  It has a memory chip
mezzanine riser card on it.  It says "B5221511 Rev. A" on a paper tag.
It says "197618" on a bar code tag.  It says "85221502 Rev 1.0" in etch.
Newest date codes are early 99.  For now I will take it to MSU for
testing.

Racks were closed up.  M124 Top was between 94.8 and 95.1.
Visible fans are running.

------------------------------------------------------------------------------

DATE:  9:11-FEB-2011

At:  Fermi

TOPICS: Collaboration meetings, March Shutdown Work, Equipment to PAB,
        Master Clock Problem

Sent note to the RC's with request for time during the March shutdown to
install the new VME I/F FPGA firmware in the M122 Top Luminosity crate.
Estimated 1 1/2 hours of down time.

Delivered the PFH-2, CMB-16, WRP-16 cards along with internal and
external readout cables, WRP card file, power supplies, fuse panel, and
DC cables to PAB.  Removed the original PMB blower system and will return
it to MSU.  Scribed the SN#s on PFH and CMB cards over the original
Sharpie SN#s.  These cards will be vapor phase cleaned at Fermi.  The
card file returns to MSU for more mounting bracket work.

Master Clock glitch at about 7:30 Friday morning.

Friday, February 11, 2011 7:37:04 CST: CAPTAIN/Log: 971628: Horst Wahl
we have a master clock alarm
cleared by pushing appropriate button in clock module
(guidance actually helped!)
**Comment by wahl on Friday, February 11, 2011 7:39:45 AM CST
we paged lumi expert

**Comment by wahl on Friday, February 11, 2011 7:44:16 AM CST
lumi expert will check to make sure lumi recording has not been affected
by the MC alarm and reset

Friday, February 11, 2011 7:39:32 CST: DAQ/Log: 971629: Konstantinos A. Petridis
TRG_CPCC_SMIS and TRG_CPCC_MOOO/STIM Master clock alarms paused the run.
Followed the guidance and pressed the "Clear Errors Button" on the Master
Clock PCC in M100 rack and resumed the run
This seems to have fixed things
3mins of downtime

Contacted Steve Chappa.  This also happened at CDF.  D-Zero cleared right
away.  The problem persisted at CDF for about 30 minutes.  It is assumed
that this was caused by a Tevatron problem with TVBS signal distribution.

Our Sequencer appears to have ridden through with no problem.  We did not
have to reset any equipment.  Just pushed "clear errors" on the PCC and
the run pausing SES Alarm went away and the run resumed.

It's clear that Tev Beam Sync dropped out.  It could be that the
Sequencer is smart enough that it only Holds if there is a TVBS without
the matching signal generated in the clock.  We do not appear to have had
a "hold".  I checked with the main control room and there were no LLRF
problems.

Racks were closed up.  M124 Top was between 94.6 and 95.2.
Visible fans are running.

------------------------------------------------------------------------------

DATE:  20,21-JAN-2011

At:  Fermi

TOPICS: TFW Power Supplies, Spare ATC Cards

TFW Power Supplies

Finish the rebuild of TFW Power Supply Chassis SN# 12.  It consisted of:
+3.3V brick SN# 99420446 and the +5V, -2V, -4.5V brick SN# 99420439.

Today put new fans into brick SN# 99420439 and put new AC fans into the
chassis.  The brick SN# 99420446 received new fans a couple of weeks ago.
So now TFW Power Supply Chassis SN# 12 has all new fans.

Note that until 16-Dec-10 this power supply chassis ran the SCL Hub-End.
See the notes in the log book from that date.
TFW Power Supply Chassis SN# 12 was set up to the following targets:

    +5.050V
    +3.350V
    -2.110V
    -4.610V

with about a 5 Amp load on each supply as it is set.

Power supply chassis SN# 12 will remain at Fermi as a spare.

The two power supply chassis that had been at Fermi as spares are:
SN# 4 and SN# 11.  Details about both of these supplies are in the log
book entry 15:17-DEC-2010.

SN# 11 will remain at Fermi as a spare.

I will bring SN# 4 back to MSU for a little while so that I can run the
MSU test crate and verify that the 1 Meg Byte EPROMs work OK for VME I/F
FPGA configuration.  After that test SN# 4 will come back to Fermi.

Note that today I used up the last 2 spare AC TFW fans stored at Fermi.
I need to bring some more TFW AC fans from MSU to Fermi.  We currently
have 4 sets of DC brick fans at MSU.  I should bring 2 of these to Fermi.

Spare ATC Cards

Including the 2 ATC cards that I had at MSU last week (and returned to
Fermi on this trip) there are a total of 8 spare ATC cards that I know
of.  I made a careful survey of all 8 of these cards.

  ATC Card
  Serial Num.   Status    Notes
  -----------   -------   -----------------------------
  Spare #2      No Good   Open LVDS trace >4 Ohms and Connector P0
                          is not installed correctly

  #12           No Good   Open LVDS trace >5 Ohms

  #53           No Good   Completely Open LVDS trace

  #74           No Good   Open LVDS trace >6 Ohms

  #80           No Good   Completely Open LVDS trace

  -none-        ?         Was labeled, "BAD LVDS CH1"
                          This card came out of a box labeled
                          "Damaged Parts" and it is not completely
                          assembled.  I will study this card more
                          next trip.

  #20           Good      The LVDS traces look OK in the Ohmmeter
                          test.  There was an open BLS trace that I
                          jumpered over with a wire wrap wire.

  Spare #3      Good      The LVDS traces look OK in the Ohmmeter
                          test.

Executive Summary

We now have 2 spare ATC cards that should both work OK.  The spare ATC
cards are labeled and stored in the brown cabinet in their normal
location.

Notes

In this sample of only 8 cards, the LVDS traces appeared to be in two
groups.
Some cards had LVDS traces with about 1.0 Ohm resistance.  Other cards
had LVDS traces with typically about 2.0 to 2.5 Ohms resistance.

For reference the resistance of one wire in the 5m long LVDS cables is
just under 1.0 Ohms.

The traces in the analog BLS signal section of the ATC card were
typically about 0.4 Ohms resistance.

Five cards have one or more open LVDS traces.  One card has one open BLS
trace.

The resistance testing of the traces was done with a setup that flowed
about 36 mA through the trace under test and allowed accurate resistance
measurement at the level of 0.1 Ohms or better.

Operations Meeting:

The Kautz Road sub-station will be shutdown from Monday March 7th through
Friday morning March 11th.  No p_bars can be made during this shutdown.
So D-Zero will either run at low luminosity for much of this time or, if
more appropriate, do maintenance work, e.g. install the M122 Top VME IF
firmware EPROMs.

Racks were closed up.  M124 Top was between 95.2 and 95.5.

------------------------------------------------------------------------------

DATE:  6,7-JAN-2011

At:  Fermi

TOPICS: VME I/F Firmware Install, TFW Power Supply Work,
        $79 L2_Busy and Lost Buffers,
        L1 Cal Trig Link Errors TAB 4 Chip 8.

Thursday afternoon I had a chance to replace the EPROMs in the 16 SM
cards in the M122 Middle crate, i.e. the Exposure Group PBS cards.  I
installed new 17C512 EPROMs with the "re-built" VME Interface FPGA
firmware called version 1811 in its documentation.

When the cards were all put back in and the crate was turned back on it
was clear that on at least one of the cards the VME I/F FPGA had not
configured.  You can tell that there was a card with an un-configured VME
FPGA because the Address Strobe LED on the slot #1 TOM card was ON.

The hunt for the non-configuring card showed that it was the one in
slot #12.  This is the PBS card for Exposure Group 4 Upper labeled EG4TU.
It was card serial number SM-64.
I tried putting another EPROM in this card with the new "re-built"
firmware and it still did not configure.  I did not try putting into the
card an EPROM with the old standard firmware.

From slot 12 I pulled out PBS card serial number SM-64 and I put in PBS
card serial number GS-48 that has an EPROM with the new "re-built"
firmware.

The GS-48 card that was installed had/has a tag on it that says, "Had
been a replacement at Fermi in M122 Middle Slot 13".  This tag is in DE
hand writing and has no date on it.

I tagged the SM-64 and put it in the cabinet here for now.

We now need to run and see if there are any more "VME Hangs" from M122
Middle.  If not then put this firmware into M122 Top.

Ted Zmuda, Andreas Jung, Marvin Johnson, and Geoff are working at the SMT
test stand to understand the "lost buffer" problem.  They thought that
they found it because they could see G.S. $79 getting L2_Acpts while it
was asserting its L2_Busy.  But TrgMon showed 0% L2_Busy from G.S. $79
(not 0.0%).

I made a box so that we could continuously assert either of the Busy
signals.  The L1_Busy made it to the TFW but not the L2_Busy.  Spent the
morning tracing these signals (including the optical coupler at the 2nd
floor window).  The problem was up on the 3rd floor where someone had put
a short extension cable on the 40 conductor twist flat cable that carries
the status information for 2 GSs.  The conductors for the L2_Busy signal
were cut.  We just removed this not needed extension and then they no
longer saw on the logic analyzer any L2_Acpts when they had L2_Busy
asserted.  They would still lose a buffer every 10 or 15 minutes when
running at 2 kHz.

Note that the Status cable that comes from $79 on the 3rd floor and is
plugged into $79 in the TFW SCL Hub-End is labeled "78 Daniel M".

We needed to find the archived TrgMon on the web data to dig into whether
or not there was ever any L2_Busy from G.S. $79.
Marco reminded us where to find this archive:

    On clued0
        /work/poidog-clued0/mverzocc/TM/trigmon-Jan-2011/

    On online system (only recent data)
        /mnt/lum/data/triggerRates/trigmon-Jan-2011

Recall special things about using signals from the SCL:

ANDing the L2_Acpt and L2_Rej with the L2 Period signal on the VRBC card.

The SLF (Serial Link Fanout) card in the Hub-End had a new firmware on
it.  This was changed during the time period 10-Jan-2002 to 7-Feb-2002 so
that some logic was done on the L2_Acpt and L2_Rej signals from the TFW
before they were sent out to the Geographic Section.  I forget exactly
what this logic is but that is the time window to look in to learn about
this.  I do not know for certain that all spare SLF cards were
reprogrammed with the new firmware.

With the return of collisions Philippe found by looking in the L1 Cal
Trig log file that we had a very low rate of parity errors on the link:

    07-Jan-2011 01:41: TAB #4 Chip #8 Status = 0x8002

There were 2 of these link errors during the store Thursday evening and
night.  Talking with Selcuk we decided to dig into this now vs waiting
and maybe having a problem next week.

Note that this signal is driven by the ATC card in ADF crate "C"
slot #14, i.e. the one that was replaced a month ago on 16-DEC-2010.
This time the problem is the "B" output from this card.  In December the
problem was its "A" output.

So we swapped the ADF crate "C" slot 14 ATC card with the one spare that
we have.

  - Pull out the cable #79 from the top "A" connector and now see
    continuous TAB 3 Chip 8 0x8120 errors

  - Pull out the cable #88 from the middle "B" connector and now in
    addition to the above see continuous TAB 4 Chip 8 0xC090 errors.

  - Pull out the cable #75 from the bottom "C" connector and now in
    addition to the above see continuous TAB 5 Chip 8 0x8040 errors.

  - Pull ATC SN# "Spare 3" out from the back of slot #14 and install
    ATC SN# "Spare 5".  Plug back in the cables in the order shown above.
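For a later look through the L1 Cal Trig log file, link-error lines like
the one quoted in this entry can be tallied per TAB and chip with a short
script.  This is only a sketch: it assumes every error line has the same
shape as the single line quoted above, the function name
tally_link_errors and the sample input are made up for illustration, and
no meaning is assigned to the bits of the status word.

```python
import re
from collections import Counter
from datetime import datetime

# Pattern inferred from the one log line quoted in this entry;
# the real log file may contain other line shapes (assumption).
LINK_ERROR = re.compile(
    r"(?P<when>\d{2}-[A-Za-z]{3}-\d{4} \d{2}:\d{2}):\s+"
    r"TAB #(?P<tab>\d+)\s+Chip #(?P<chip>\d+)\s+"
    r"Status = 0x(?P<status>[0-9A-Fa-f]{4})"
)


def tally_link_errors(lines):
    """Count link-error lines per (TAB, chip); keep the last status word seen."""
    counts = Counter()
    last_status = {}
    for line in lines:
        m = LINK_ERROR.search(line)
        if m is None:
            continue  # not a link-error line
        # Parse the time stamp only to reject malformed dates.
        datetime.strptime(m["when"], "%d-%b-%Y %H:%M")
        key = (int(m["tab"]), int(m["chip"]))
        counts[key] += 1
        last_status[key] = int(m["status"], 16)
    return counts, last_status


# Hypothetical input: the quoted line plus one unrelated line.
sample = [
    "07-Jan-2011 01:41: TAB #4 Chip #8 Status = 0x8002",
    "some unrelated log line",
]
counts, last_status = tally_link_errors(sample)
for (tab, chip), n in sorted(counts.items()):
    status = last_status[(tab, chip)]
    print(f"TAB {tab} Chip {chip}: {n} error(s), last status 0x{status:04X}")
```

Feeding it the whole log file (for example the lines of an open file
object) gives one summary line per TAB and chip that reported errors.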
I'm taking back to MSU both the card that we removed today, "Spare 3",
and the card that we removed from this slot a month ago, ATC SN# 053.
A first pass at repair is going to be to solder the pins on the press-in
LVDS connectors.

I returned to Fermi the 3.3 Volt supply SN# 99420446 that failed while
running in TFW Power Chassis #12 powering the SCL Hub-End.  This power
supply now has new fans and appears to run just fine.  The other brick in
this power chassis (i.e. SN# 99420439) needs to get new fans and then
this power chassis is ready to put back together.

------------------------------------------------------------------------------