D-Zero Hall Log Book for 2009
-------------------------------

The most recent entries are near the beginning of this file. This file begins in January 2009.

This file contains both Trigger Framework and L1 Calorimeter Trigger entries.

Earlier D-Zero Hall Log Books are on the web in one of the following directories:

  http://www.pa.msu.edu/hep/d0/ftp/l1/framework/logs/
  http://www.pa.msu.edu/hep/d0/ftp/run1/l1/inventory_logs/

------------------------------------------------------------------------------

DATE: -09    At: Fermi

TOPICS:

------------------------------------------------------------------------------

DATE: 17:18-DEC-2009    At: Fermi

TOPICS: Walk Through Inspection, Work on TCCs, VMESCL and L1 Cal Trig, Marvin PHN scope setup

Walk through, racks closed up and all visible fans running. M124 top temperature 91.7 to 91.8 deg F.

Long access on Friday to open both sides of the iron for Cal Preamp Power Supply replacement.

Requested time to test the L2 TCC with the new drivers but time ran out before this could be done. We do not have separate sheets for all 4 "new" TCCs. I need to verify that the sheet for the TFW TCC has pw and such that cover both machines and the same for the pair of L2 machines. I did run the L2 TCC machine with the new drivers to verify that it was ready to try testing.

Meeting with Mike and Selcuk about status of L1 Cal Trig. On the sidewalk, the VMESCL with the SCL Receiver on it is the spare that we should use if necessary. We tested both of the VMESCL cards (using the same SCL Receiver) and they clearly behave differently. I sent a note to Ted Zmuda asking if he could stop by with a laptop with JTAG and Altera software to read back the firmware from both of the spare VMESCL cards. Depending on the result of that we may also need to read back the MCH-1 VMESCL card's firmware. Mike will contact Ted to schedule this.

Marvin wants to set up automated scope monitoring for PHN checks.
To make a place for this I moved the Columbia firmware machine out of the shelf in rack M105. This machine has not been used for > 2 years and Mike said that he can no longer log into it. I put this machine and its monitor, keyboard, mouse, and cables on a bench in the L1 Cal Trig area on the sidewalk. I sent a note to Mike.

T962 looks OK and is running fine.

1203 1977 3082

------------------------------------------------------------------------------

DATE: 3:4-DEC-2009    At: Fermi

TOPICS: Walk Through Inspection, Work on TFW Power Chassis #11, First Try at Looking at the 0x10 Front-End Busy No Data to L2 Problem

I brought down to Fermi the repaired ASTEC VS3-D1-D1-00 3.3V supply that failed on 21-July-09 while running in the Routing Master. This 3.3V module is ASTEC SN# 99420448. I reinstalled it back in TFW Power Supply Chassis Number 11 with the 5V,2V,4.5V module ASTEC SN# 99420440 (which is its original partner in that chassis).

Log book entries about this failure are: 26:29-JULY-2009, 12:14-AUG-2009, and 26:28-AUG-2009

Need to bring a long 4-40 flat head screw for the front panel test point connection to the 3.3V supply.

The 3.3V module looks just fine. The LED in the PFC part of the 5V,2V,4.5V module flashes Red/Green and back to Red a couple of times at AC power on before it settles on Green. From previous experience this may be an early sign of trouble - and it's the PFC part that seems to fail the most. This 5V,2V,4.5V module ASTEC SN# 99420440 is the one that I put all new fans in.

I have set up this TFW Power Supply Chassis #11 to the following targets:

  +5.050V   +3.350V   -2.110V   -4.610V

This is a pretty good compromise for the required settings in the different racks.

I also operated TFW Power Supply Chassis #4 and checked its outputs. This was its scheduled once per year operation. This is a full normal series production TFW Power Supply Chassis. Its outputs are set up as:

  +5.065V   +3.333V   -2.130V   -4.635V

i.e.
basically the same as the values that I re-derived for chassis #11 above.

I also operated TFW Power Supply Chassis #1 and checked its outputs. This is the "pre-production" supply that was used at MSU in the crate on the wheel around DEC cart-rack. Its outputs are set up as #4, i.e. +5.065V +3.333V -2.130V -4.635V. This chassis is on the bottom of the stack and should be used last.

MCH walk through; all externally visible fans running, doors closed up. M124 back top is 91.4 deg F.

First try with Selcuk and Cool Joe to look at the L1 Cal Trig problem of 0x10 FEB and no data to L2 Cal, that happens about once per store.

1. Powered down just the Comm/Control crate and then turned it back on.

2. Selcuk and Joe did some set of steps in the cold start procedure to get the system running again but the amber LED on the SCL Receiver on the VMESCL never came back on. I do not know what steps of the cold start procedure they did. No one knew at what step in the cold start the VMESCL should tell its SCL Receiver to try locking onto the SCL data stream.

3. Time spent looking at why TAB-GAB were not working when there clearly is no SCL timing getting to the VMESCL card and thus no SCL timing getting to the TABs and GAB. (The VMESCL's SCL Receiver's amber LED was off and that clearly means that it is not locked on.) Decide to swap the VMESCL for the sidewalk spare.

4. Power down Comm/Control, swap to the spare VMESCL (leaving the SCL Receivers on the cards they were attached to). Selcuk and Joe did some steps in the cold start procedure to get the system running again. (a different set of steps ?) The SCL Receiver amber LED came on with no problem. When the system was "running" the TCC display in MCH-1 showed TAB link errors (all TABs) on every monitor readout sweep. I believe that the control room monitor gui also showed errors on all TAB links. Panic that there is something wrong with the spare VMESCL from the sidewalk - maybe different firmware or something.
Decide to swap back to the original VMESCL.

5. Power down Comm/Control, swap to the original VMESCL (leaving the SCL Receivers on the cards they were attached to). Selcuk and Joe did some steps in the cold start procedure to get the system running again. (a different set of steps ?) The SCL Receiver amber LED came on with no problem. When the system was "running" the TCC display in MCH-1 showed no errors during the monitor readout sweeps.

6. Selcuk and Joe saw a "purple box" on the control room gui and decided there was a GAB problem. TrgMon display showed that there were L1 And-Or terms coming out of the GAB OK so it must have had something rational loaded into it.

7. Because of the "purple box" a couple of cycles of loading just the GAB and then Initializing were done. By this time the store was in. L1 Cal Trig monitor readout sweeps still looked clean in the MCH-1 display. TrgMon L1 And-Or Term rates still looked rational. As soon as the last Initialize completed - I suggested to try running. This was 11 minutes into the store.

8. It looked like over the next 4 minutes or so (next two sweeps of the Active Pedestal Control) that the pedestals were moving around quite a bit. I think that the first 5 LBNs of the store that started Thursday afternoon will be marked as bad.

- I do not know what was stuck or restarted in the control room monitor gui.

- During the period when the amber LED would not come on, on the original VMESCL, I plugged the SCL cable from the VMESCL into the SCLD. The SCLD was happy with this SCL cable.

- Before starting any of this I checked for a loose SCL cable connection at the SCL Hub-End on the cables that run the L1 Cal Trig. Everything was fine.

- We do not know or have a consistent way to start up the system ?

- There seems to be some mystery about GAB sometimes needing multiple tries to start up.

- No one knows when in the cold start steps the VMESCL should be alive enough to tell its SCL Receiver to lock on to the SCL data stream.
Mike thinks that it should just need power and then the VMESCL state engine should be running that kicks the SCL Receiver but he is not sure about this.

- We may not have learned anything about the fitness of the spare VMESCL because I *think* that a different set of steps were taken to start up the system when it was installed than during the other start up attempts Thursday afternoon.

- Friday morning at the sidewalk Mike could not get the official spare VMESCL to work in the way that he thinks it should so he switched and made the other of the two VMESCL cards on the sidewalk the official spare. It does behave the way that he thinks it should but he wants to check the firmware version on the VMESCLs and see if the VMESCL needs a "master reset" before its state engine runs to kick the SCL Receiver.

DAQ-480 T962 were running fine.

------------------------------------------------------------------------------

DATE: 18:20-NOV-2009    At: Fermi

TOPICS: Walk Through Inspection, Work on TFW and L2 TCCs, TRICS 11.3B, L1 Cal Trig Problem Friday Early AM.

I brought down 2 more computers this trip. They are:

DELL box #4. This is set up as a TFW TCC. It has the 11.3B version of TRICS installed on it and ready to autostart. Version 11.3B is the version with the 15 second brief Luminosity blocks. Into this machine I put the Bit-3 pci card for the copper VME interface that I pulled out of the original HP TFW TCC. I set the Luminosity Block Number in DELL box #4 to 0x006becff which is current as of about 18:00 on 18-Nov-09. Note that DELL box #4 is the one that we noted had more dirt inside.

DELL box #1. This is set up as an L2 TCC. It has the "new" type of Bit-3 drivers installed in it. Into it I put the Bit-3 pci card for the expansion box connection that I pulled out of DELL box #2. Note that DELL box #1 is the one that had its power supply fan not running for a while. This was "fixed" by taking the power supply apart and putting it back together.
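
The LBN seeds in this log book are written in hex, while the COOR log book entries are in decimal. A minimal sketch of the conversion, using the two seed values recorded in this file (0x006becff here and 0x006b4cba in the 4:6-NOV-2009 entry). The function names are made up for this illustration and are not part of TRICS.

```python
# Convert Luminosity Block Numbers between the hex form used in this
# log book and the decimal form used in the COOR log book.
# Function names are hypothetical, for illustration only.

def lbn_hex_to_decimal(lbn_hex: str) -> int:
    """Parse an LBN written as a hex string, e.g. '0x006becff'."""
    return int(lbn_hex, 16)


def lbn_decimal_to_hex(lbn: int) -> str:
    """Format an LBN as an 8-digit hex string with a 0x prefix."""
    return f"0x{lbn:08x}"


if __name__ == "__main__":
    for seed in ("0x006b4cba", "0x006becff"):
        # Print with thousands separators, like the COOR style 7,031,994
        print(seed, "=", f"{lbn_hex_to_decimal(seed):,}")
```

The first seed comes out as 7,031,994, which is the decimal example already quoted in the 4:6-NOV-2009 entry.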
At about 16:30 Wednesday afternoon I was allowed to stop TRICS on the running TFW TCC (DELL box #3) so that I could start the new version of TRICS that defaults to 15 seconds between Brief Luminosity Blocks. I stopped TRICS 11.3A and started TRICS 11.3B. After starting TRICS 11.3B I asked Michelle Prewitt and she confirmed that the Luminosity system was now getting a Brief Block once every 15 seconds.

Friday Philippe made 11.3B the auto start version on DELL box #3, i.e. the currently running TFW TCC.

Next is a long string of tests using DELL box #1, the L2 TCC with the new bit-3 drivers and the spare bit-3 PCI Expansion Box.

Install just 1 bit-3 pci card in the 2nd slot in the PCI Expansion Box.

Boot (#1), log into administrator, watch a window come up for about 1 or 2 seconds that says that it's installing the driver for the bit-3 card, then:

  my_computer right click --> properties --> hardware --> device manager
  + expand "SBS Bus Adapter", right click on "Model 618/620/dataBlizzard",
  select Properties, select Resources

    input/output range   ece0-ecff
    memory range         fbef0000-fbefffff
    memory range         fbee0000-fbeeffff
    memory range         f8000000-f9ffffff
    interrupt request    19

Shutdown.

Move the same bit-3 pci card to a different slot in the PCI Expansion Box, i.e. to the 1st slot.

Boot (#2), log into administrator, again see a window come up for about 1 or 2 seconds that says that it's installing the driver for the bit-3 card. Do the steps above to look at Model 618/620/dataBlizzard Properties Resources: everything is the same except now interrupt request 16. Shutdown.

Change nothing !

Boot (#3), log into administrator, did not see a message about installing a driver. Shutdown.

Install a different bit-3 pci card in the same slot in the PCI expansion box, i.e. 1st slot.

Boot (#4), log into administrator, did not see a message about installing a driver.
Look at Device Manager Resources for the bit-3 and everything is the same including interrupt request 16. Shutdown.

Move this bit-3 pci card back to the original slot in the PCI Expansion Box, i.e. 2nd slot.

Boot (#5), log into administrator, did not see a message about installing a driver. Look at Device Manager Resources for the bit-3 and everything is the same except interrupt request now is 19 again. Shutdown.

Move this bit-3 pci card to the 3rd slot in the PCI Expansion Box.

Boot (#6), log into administrator, and this time see a window come up for about 1 or 2 seconds that says that it's installing the driver for the bit-3 card. Look at Device Manager Resources for the bit-3 and everything is the same except interrupt request now is 18. Shutdown.

Install a different bit-3 pci card in pci expansion box 3rd slot. This is a different card than was used in any of the tests above.

Boot (#7), log into administrator, did not see a message about installing a driver. Look at Device Manager Resources for the bit-3 and everything is the same including interrupt request 18. Shutdown.

We have seen some L1 Cal Trig problems. Control Room log book clips:

  Thursday, November 19, 2009 11:50:32 CST: DAQ/Log: 838923: Marcos Martins Geoff Savage - sclinit for crate 0x10 100% FEB.

  Friday, November 20, 2009 2:54:20 CST: DAQ/Log: 839023: Tyler Dorland unusual problem where x10 goes FEB, no alarms pop up and it is cleared with an scl init. Then there is immediately an error with x23 losing synch. I suspect a level 2 problem and have paged the expert. L1 cal has also been paged to confirm nothing is wrong with x10.

  Shift summary Around 2:30 or so we had a series of x10 FEBs with no alarm given. An sclinit cleared this and daqai would issue another for x23 losing sync. We paged the l1cal and l2 experts and there has not been a conclusion as to what the problem was They are investigating and monitoring now.
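
The seven PCI Expansion Box boots recorded earlier in this entry can be tabulated to see what the interrupt request and the driver-install window follow. This is only a sketch built from the values written down above. The card labels A, B, C are my own shorthand for the three physical bit-3 cards, the IRQ for boot #3 was not recorded and is entered as None, and the two checks describe only these seven boots, not the behavior of the driver in general.

```python
from collections import defaultdict

# (boot number, card label, expansion box slot, IRQ or None, install window seen)
# Card labels A/B/C are shorthand for this illustration only.
BOOTS = [
    (1, "A", 2, 19, True),
    (2, "A", 1, 16, True),
    (3, "A", 1, None, False),   # IRQ not looked at on this boot
    (4, "B", 1, 16, False),
    (5, "B", 2, 19, False),
    (6, "B", 3, 18, True),
    (7, "C", 3, 18, False),
]


def irqs_by_slot(boots):
    """Collect the set of IRQs that were observed for each slot."""
    seen = defaultdict(set)
    for _boot, _card, slot, irq, _installed in boots:
        if irq is not None:
            seen[slot].add(irq)
    return dict(seen)


def install_only_on_first_use_of_slot(boots):
    """True if the install window appeared exactly on the first boot of each slot."""
    used_slots = set()
    for _boot, _card, slot, _irq, installed in boots:
        first_use = slot not in used_slots
        if first_use != installed:
            return False
        used_slots.add(slot)
    return True


if __name__ == "__main__":
    for slot, irqs in sorted(irqs_by_slot(BOOTS).items()):
        print(f"slot {slot}: IRQ(s) {sorted(irqs)}")
    print("install window only on first use of a slot:",
          install_only_on_first_use_of_slot(BOOTS))
```

In this data each slot shows a single IRQ regardless of which card is in it, and the install window shows up only the first time a given slot is populated.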
Selcuk, Mike, and I talked during and after the operations meeting about the L1 Cal Trig problems we had early Friday AM. This note is just a quick summary of what we think we know so far.

- To the shifters the problem appeared as both 0x10 going 100% L1 Busy and the L2 Cal Trig stalling. The common thing that would cause this to happen is if the L1 Cal Trig were not responding to L1 Accepts by sending its data to the 0x10 readout VRBs and to the L2 Cal Trig.

- In the monitoring data on the L1 Cal Trig TCC the problem appeared as all TAB cards having trouble receiving ADF data and trouble on the GAB card receiving TAB data. A common point that could cause this is a timing problem in the TAB-GAB crate.

- A "global" TAB-GAB crate timing problem could be caused by a problem on the VMESCL card or a problem with the SCL Receiver mezzanine on the VMESCL card. For example the SCL Receiver could be losing lock on the SCL data stream. (The VMESCL card gives timing and VME communications to all cards in the TAB-GAB crate.)

- We don't think that this was likely a timing problem in the ADF crates because even if the ADF crates were not sending rational data to the TABs, the TAB-GAB crate would respond to L1 Accepts. That is, we do not think that it was a problem on the SCLD card or with the SCL Receiver on the SCLD card. (The SCLD card sends timing information to the 4 ADF crates.)

- Mike verified that, if the SCL Receiver on the VMESCL card lost lock on the SCL data stream, then the VMESCL card would automatically tell its SCL Receiver to try again to lock onto the SCL data stream.

- Mike verified that there is not a latched monitoring bit to indicate that the VMESCL card has had to tell its SCL Receiver to try locking on again. Because the SCL Receiver's re-lock process is deterministic and fast - TCC's once per 5 seconds monitor data reads most likely would not see this happen.
- In MCH-1 we noticed that the SCL coax to the VMESCL card was sticking out into the aisle way somewhat. We moved this cable back into the rack's cable tray and tied it down.

- We verified that the spare VMESCL card is out on the sidewalk and that it has an SCL Receiver mezzanine installed on it.

- Another possibility that could cause a "global" problem in the TAB-GAB crate is a fast power supply glitch. This may not show up in the power supply monitoring.

- If there is another period of many problems of this same type one could try watching the amber LED on the SCL Receiver mezzanine on the VMESCL card. If this is an SCL problem then you may be able to see the amber LED go out for a short time and then quickly come back on. The VMESCL card is the right most card in the Communications/Control crate in M108. The SCL Receiver mezzanine is near the back of the VMESCL and kind of hard to see.

- The SCL_Inits that were needed last night to get things going again were to clear out all the buffers in all the DAQ system crates so that the overall D-Zero DAQ system could start back up in sync on the same event.

- Philippe verified that ITC connects and disconnects from the L1 Cal Trig TCC at about the same time were from the TT Monitoring Gui program running in the Control Room.

1075 1133 1203

------------------------------------------------------------------------------

DATE: 4:6-NOV-2009    At: Fermi

TOPICS: Walk Through Inspection, Work on TFW and L2 TCCs, M124 Door, Bit-3 618 pci end to Bill, Investigate noise in TTs +5,23 EM +6,23 EM, Trigger Rate Undulations

Checked the M124 SCL Hub-End top back temperature when I got here and it was 94.1 deg F. I replaced the battery and it was 93.7. The old battery was 1.3V and had a rotted case (but the display was still pretty crisp).

I brought down the stuff to put vents in the top of the M124 rear door. Rolando Flores cut the holes in the back of the door and bolted in the perf metal that I brought down.
One must be careful that the new metal edges associated with these vents do not cause trouble for the SCL Status cables. It looks fine but you just need to pay attention. There is air flowing out of the new vents. Now the temperature is about 91.5 deg F.

Thursday afternoon Mike and I got a chance to swap in both the replacement TFW TCC and the replacement L2 TCC (the Dell machines that were brought down last trip).

- We installed the replacement TFW TCC and it seems to work just fine. We triple checked that the LBN was OK. The LBN "seed" that we put on the replacement TFW TCC was 0x006b4cba. Note that the COOR log book entries are in decimal, e.g. 7,031,994. The old original TFW TCC is labeled and is in the cabinet. Thursday evening we started a new store running on the replacement TFW TCC and so far no problems. If things continue OK then I should bring the 2nd "new" spare Dell TFW TCC down to Fermi. On the newly installed replacement TFW TCC I checked and it does have the /DFC/ .dci files and the /D0_Config/Boot_Auxi.mcf file that indicate that it is current through the latest timing change on 14-Dec-06.

- We installed the replacement L2 TCC but it crashes when booting if it is plugged into the powered up Bit-3 PCI Expansion box. It does a blue screen and says, "KMODE Exception Not Handled". We returned to the original L2 TCC.

Mike had the nice idea of associating all the Ethernet hardware addresses that might be used for either TCC with the Internet address of that machine. Bill Lee says that is easy to do and thus there will not be any address problem if we have to swap a machine at 4AM.

To run the M125 rack (on top of the air conditioner) with the stand up replacement Dell TFW TCC and the lay flat pizza box original L2 TCC I had to pull out the logic analyzer that lets us look at the SCL output. That may be OK for now but it's stupid to leave things this way where in an emergency we can not see our output "product".
Being able to see the SCL stream has been very important to fixing some types of problems in the past.

On the bench, working with the replacement L2 TCC and the spare PCI Expansion Box, the TCC booted just fine when plugged into the empty but powered up PCI Expansion Box. Mike and I then put six Bit-3 model 618 (pci end for the optical VME interface) into the PCI Expansion Box and booted the TCC. It booted just fine and gave the expected message, "You do not have sufficient security privileges to install devices ...", so I know that it saw the one 618 card OK. 4 of the borrowed 618s for this test came from the L2 test stand, 1 from the L1 Cal Trig sidewalk test stand, and 1 is my official spare from the spares cabinet. We even tried plugging the 618s, two at a time, into crates in the L2 test stand and the replacement Dell L2 TCC was still happy to boot. So far we can not reproduce the blue screen crash on the bench even with 6 618s in the pci expansion box.

Philippe reports that it took special stuff in the BIOS of the original L2 TCC to make it work.

Notes: The Bit-3 card in the replacement L2 TCC that was tried out today is the one that was found stored in the spare PCI Expansion box. This host computer card is Bit-3 P/N 85224036 Rev. A SN# 182523.

The Bit-3 card that was used in the spare L2 TCC that ran for about 1 week in early October and dropped connections with COOR was the card pulled out of the original L2 TCC. This card was put back in the original L2 TCC when it was re-installed a couple of weeks ago. I do not have its numbers.

The spare PCI Expansion box itself is Bit-3 P/N 5421140 Rev A SN# 187996.

It's not clear to me what the next step is but it's been 40 days and we still do not have a working spare L2 TCC. We need to get this taken care of without thrashing the online system.

I brought down to Fermi the pci end of a Bit-3 618 fiber optic VME interface and gave it to Bill Lee Wednesday afternoon.
This is for the spare L1 Cal Trig TCC that Bill will start putting together. This 618 pci card said T962 on it and it came from the top of the short test rack in 1200C.

The TT Monitor Gui had started to show a bigger than normal RMS for +5,23 EM over the past couple of days. Starting Wednesday afternoon I watched +5,23 EM and +6,23 EM on the scope.

Most of the time on +5,23 EM I see a "bump" about once every 12 usec. Each bump first swings smoothly in the negative energy direction about 75 mV pk (per side) for 2 usec and then it swings in the positive energy direction about 40 mV pk for about 3 usec. The swing in the positive energy direction often has 3 or 4 undulations in it. At other times mixed in with these bumps there is faster ringing with a period of about 850 nsec. At times the 12 usec spaced bumps are very stable. Over Thursday and Friday the amplitude grew by about 25%.

Most of the time on +6,23 EM I see an almost constant sine wave looking pattern 50 mVpp with a 475 nsec period.

I checked with Dean to see if there are any resistors still out of the BLS summer hybrids for these two TTs. He replied: "All the summer resistors are in both 5,23 and 6,23 (there used to be 3 resistors removed, two in 6 and one in 5 which were removed). The chances of having trigger issues on two adjacient BLS card (5,23 and 6,23) is unlikely, but we definitely should check."

The big issue here is that this is exactly where we would see PHN.

The other big triggering issue is the undulations in trigger rate with the low luminosity prescale sets that have shown up recently. It is likely that this has been masked for perhaps the past month by the high rate calibration triggers that have run during the second half of each store for the past month or so and just stopped this week. These undulations are very clear in the Friday afternoon store. It looks like a 20 minute period. Is the Active Pedestal Control tracking it ? Is there anything in the Cal HV current plots ?
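
To compare the scope observations above against possible noise sources it can help to have the periods as frequencies. A small sketch using only the periods written down in this entry (12 usec bumps, 850 nsec ringing, 475 nsec sine pattern, 20 minute rate undulation). The labels and the function name are made up for this illustration.

```python
# Convert the periods noted in this entry to frequencies (f = 1 / T).
# Labels and function name are hypothetical, for illustration only.

PERIODS_SECONDS = {
    "+5,23 EM bump spacing (12 usec)": 12e-6,
    "+5,23 EM fast ringing (850 nsec)": 850e-9,
    "+6,23 EM sine pattern (475 nsec)": 475e-9,
    "trigger rate undulation (20 minutes)": 20 * 60.0,
}


def period_to_frequency_hz(period_s: float) -> float:
    """Return the frequency in Hz for a period given in seconds."""
    if period_s <= 0:
        raise ValueError("period must be positive")
    return 1.0 / period_s


if __name__ == "__main__":
    for label, period in PERIODS_SECONDS.items():
        freq = period_to_frequency_hz(period)
        print(f"{label}: {freq:.6g} Hz")
```

This gives roughly 83 kHz for the bump spacing, about 1.2 MHz for the ringing, and about 2.1 MHz for the +6,23 EM pattern.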
1075 2014

------------------------------------------------------------------------------

DATE: 21:23-OCT-2009    At: Fermi

TOPICS: Spare TCC work, Walk Through Inspection

Brought down the first 2 of the 4 "new" spare TCC computers that Philippe has put together - one "new" spare TFW TCC and one "new" spare L2 TCC.

I put the spare pci card for a Bit-3 617 copper pci-vme card pair into the "Dell-W2K d0tcc1 spare" (the Dell #3 box that I brought down this week). This pci card is P/N 85221511 Rev A S/N ? 186293. This is the pci card from the spare Bit-3 617 card pair that has been in the storage cabinet for the past 5 years or so. These "Dell-W2K" boxes have 4 pci slots. The sound card is in the first slot. I skipped a slot and then installed the pci card for the Bit-3 617. So there is an empty slot on each side of the Bit-3 pci card. This makes "Dell-W2K" box #3 the "primary" spare TFW TCC.

At Fermi booted the "Dell-W2K d0tcc1 spare" into the administrator account and went through the steps so that the "found new hardware wizard" did not actually do anything. Then let it restart and auto-login to the trigger account and auto-start TRICS. That all looks OK. We also started TrgMonII and that looks OK. Run shutdown and put the machine in the cabinet.

The HP "Workstation xw4200" that was brought down here on 30-Sept-09 as a L1/L2 spare and then last trip morphed into an L1 spare is coming back to MSU with me.

I put the spare "host card" for the Bit-3 pci expansion box into the "Dell-W2K d0tcc2 spare" (the Dell #2 box that I brought down this week). This pci expansion "host card" is Bit-3 P/N 85224036 Rev. A SN# 182523. This pci expansion "host card" is the one that Mike and I got out of the spare L2 pci expansion box that Mike found and brought down from the L2 spares area on my last trip here. He still needs to put this pci expansion box back in the L2 spares area. I put this Bit-3 "host card" in the middle of the 3 empty pci slots.
This makes "Dell-W2K" box #2 the "primary" spare L2 TCC.

At Fermi booted the "Dell-W2K d0tcc2 spare" into the administrator account and it all looked fine. There were no windows about new hardware or anything like that. Re-booted and logged into the trigger account and it also looked OK.

The HP "Workstation xw4200" that has been here for some years and then ran as the L2 TCC from 27-Sept-09 to 7-Oct-09 will remain down here in the cabinet.

That is: Dell #2 box, Dell #3 box, and HP Workstation xw4200 SN# 2UA5230VSL will all remain here. For now only HP Workstation xw4200 SN# 2UA5230VSM will go back to MSU.

The "Dell-W2K" boxes take a CR2032 battery.

Sitting in the flat position the "Dell-W2K" boxes are: 7 1/4" tall, 16 3/4" wide, and 17 3/4" deep. For comparison the HP Workstation xw4200 is 6 3/4" tall, 17 3/4" wide, and 17 3/4" deep. So except for being 1/2" taller in this flat position the "Dell-W2K" boxes are smaller than the HP Workstation xw4200.

An HP Workstation xw4200 has been in the L2 TCC position in the M125 rack. In M125 there is currently 7" of vertical space for the TFW TCC and 7" of vertical space for the L2 TCC. So we need to make more room. This should be easy as there is 3" of empty space above the pci expansion box. We could reduce this to 1" of space above the pci expansion box and then have 2 slots each 8" tall for the TCCs. The order of equipment in M125 from the top is: TFW TCC, L2 TCC, pci expansion, logic analyzer. In an emergency we could pull out the logic analyzer.

Check the externally visible fans and they are OK. M124 is 94.4 or 94.6 deg F. I think this is creeping up or the battery is dying or something. This is the relatively hot receiver part of the SCL Hub-End.

T962 preamp fan moved down under the bath tub.
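
The M125 space arithmetic above can be checked with a few lines. This is just a sketch of the numbers written in this entry (two 7" TCC slots, 3" of empty space above the pci expansion box to be reduced to 1", box heights of 7 1/4" and 6 3/4"). The names are made up for this illustration and the freed space is assumed to be split evenly between the two TCC slots.

```python
from fractions import Fraction

# Heights in inches, taken from this log entry.
DELL_W2K_HEIGHT = Fraction(29, 4)   # 7 1/4" in the flat position
HP_XW4200_HEIGHT = Fraction(27, 4)  # 6 3/4"
SLOT_HEIGHT_NOW = Fraction(7)       # current space for each TCC in M125
GAP_NOW = Fraction(3)               # empty space above the pci expansion box
GAP_PROPOSED = Fraction(1)


def slot_height_after_shrinking_gap(slot_now, gap_now, gap_new, n_slots=2):
    """Share the space freed above the expansion box evenly among the slots."""
    freed = gap_now - gap_new
    return slot_now + freed / n_slots


def fits(box_height, slot_height):
    """True if a box of the given height fits in the slot."""
    return box_height <= slot_height


if __name__ == "__main__":
    new_slot = slot_height_after_shrinking_gap(SLOT_HEIGHT_NOW, GAP_NOW, GAP_PROPOSED)
    print("Dell fits in current slot:", fits(DELL_W2K_HEIGHT, SLOT_HEIGHT_NOW))
    print("HP fits in current slot:  ", fits(HP_XW4200_HEIGHT, SLOT_HEIGHT_NOW))
    print("new slot height (inches): ", new_slot)
    print("Dell fits in new slot:    ", fits(DELL_W2K_HEIGHT, new_slot))
```

With these numbers the Dell box is 1/4" too tall for a current slot and has 3/4" to spare in an 8" slot.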
1123 1206 1951 1977 1993 3264

------------------------------------------------------------------------------

DATE: 7,8-OCT-2009    At: Fermi

TOPICS: L2 TCC Work, Spare TCC work, Walk Through Inspection

Brought the rebuilt original L2 TCC back down to Fermi and between stores at about noon on Wednesday it was put back into service. So far it is running OK.

The machine that had been the L2 TCC for the past 11 days was pulled out of service, a Bit-3 pci card was put into it for connecting it to the pci expansion box, the machine was labeled as the spare L2 TCC, and it was put into the storage cabinet. It is 100% ready to quickly be put back into service as the L2 TCC but it has the "new" version of Windows on it and it will drop COOR connections.

The Bit-3 card that was put into this now spare L2 TCC to connect it to the pci expansion box had been stored in the expansion box along with the cable that runs between the two. It looked very much like something that Philippe may have packed up "N" years ago.

I had not looked at one of these Bit-3 pci expansion boxes in a long time. It is a nicely thought out and very sturdy box by today's standards for computer boxes. Chips on the board in the pci expansion box include: DEC DC1030G, Dallas DS1708, and Motorola MPC951.

The machine that was brought down to Fermi last week as the New 2nd Spare L1/L2 TCC was this week made into a ready to run spare TFW TCC. It had the pci card for a Bit-3 617 copper pci-vme card pair put into it. This pci card was P/N 85221511 Rev A S/N ? 186293. This was the pci card from the spare Bit-3 617 card pair in the storage cabinet.

Following Philippe's instructions I: booted and logged in as "philippe" and let it find the 617. I just clicked "next" through the steps to automatically install the driver for the 617.

I double checked that the machine was on Central time and that screen #1 was at 1024x768 and that the desk top was not extended to screen #2.
RECALL if this box acts the same as the other modern box then when it boots connected to the KVM switch and the stand up console its screen #1 will default back to 1280x1024 and it will not work with the stand up console.

Started the control panel. Selected "Network Connections", "Local Area Connection" and right-clicked for "Properties". Select "Internet Protocol (TCP/IP)" and click "Properties". Give it the IP address 131.225.231.215, netmask 255.255.254.0, gateway 131.225.231.200, and DNS servers 131.225.231.254 and 131.225.227.254. "Ok" all the way out.

Logged out of Philippe and logged in as Trigger.

Working on the D: drive I copied the whole "D0_Trics_II_Save" directory structure to a new place called D:\Trics\.

From D:\Trics\exe\ selected "Trics_II_V11.3_RevA_Bit3_984_V1.3.0.exe" and dragged it to the desk top and made a short cut out of it that I renamed "Trics_II V11.3.A".

Double clicked the "Trics_II V11.3.A" short cut and as expected it put out 10**9 errors. While its ascii log window was up I verified in that window's properties that "Quick Edit Mode" is NOT enabled. "Insert Mode" was enabled and I un-checked that and clicked OK telling it to keep this setting for every time the short cut starts this application.

"Windows Security Alert" came up that says that I will need the administrator to unblock this program. With Philippe's help I ran the TRICS from the Philippe Administrator account. This time when the "Windows Security Alert" came up it offered the option to "un-block" the application. I clicked "un-block" and although it did not say so it appears that it does not just un-block it for this instance but rather that it un-blocks it forever for all users.

Logged back in as Trigger and verified that TRICS was no longer blocked from that account.

In D:\trics\exe\ right-click-and-drag "Trigmon_II_V2.0_RevD.exe" to the desk top and make a short cut called TriMon_II. I tried running this and it said that it was trying to connect.
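
The TCP/IP settings entered above can be sanity checked with the Python standard library `ipaddress` module. This is only a sketch using the addresses written in this entry. It checks which of the gateway and DNS addresses fall on the same subnet as the TCC, nothing more. The variable and function names are made up for this illustration.

```python
import ipaddress

# Values as typed into the "Internet Protocol (TCP/IP)" properties above.
TCC_ADDRESS = "131.225.231.215"
NETMASK = "255.255.254.0"
GATEWAY = "131.225.231.200"
DNS_SERVERS = ["131.225.231.254", "131.225.227.254"]


def local_network(address: str, netmask: str) -> ipaddress.IPv4Network:
    """Return the subnet that the address and netmask describe."""
    return ipaddress.ip_interface(f"{address}/{netmask}").network


def on_local_subnet(host: str, network: ipaddress.IPv4Network) -> bool:
    """True if the host address lies inside the given subnet."""
    return ipaddress.ip_address(host) in network


if __name__ == "__main__":
    net = local_network(TCC_ADDRESS, NETMASK)
    print("local subnet:", net)
    print("gateway on subnet:", on_local_subnet(GATEWAY, net))
    for dns in DNS_SERVERS:
        print(f"DNS {dns} on subnet:", on_local_subnet(dns, net))
```

The gateway has to be on the local subnet for the setup to work at all. A DNS server that is off the subnet, like the second one here, is simply reached through the gateway.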
The instructions for this work are in an email message from Philippe on 8-Oct-09 with the subject "Re: more today ? 4th".

Walk through inspection and all visible fans are running. 93.8 to 94.0 deg F at the back of M124.

Bring back to MSU the Data I/O PROM Programmer and its old PC.

Brought down 10-24 allen hardware for the spare blowers and installed it in the two spare blowers.

Brought spare +8V 10A and 5V 18A bricks down to Fermi and put in cabinet.

1866 1863 1791 1993 1203 3082 1977

------------------------------------------------------------------------------

DATE: 20-SEPT:2-OCT-2009    At: Fermi

TOPICS: L2 TCC Work, Collaboration Meeting, Walk Through

Deliver the new (2nd) spare TFW or L2 TCC to D-Zero. I put a paper label on it so that it is clear that it is the "new" spare TCC. I put it in the cabinet along with the paper with notes that Philippe prepared for it and gave to me this morning and I put with it a copy of his "Disaster Plan" email note from 13-July-2007 which tells some of the things that you need to do to set up this machine for actual operation.

In the cabinet was the old running L2 TCC that failed last Sunday. I put a label on it so that it was clear what computer was what.

Also in our cabinet was an Optical Bit-3 PCI card (that had been pulled out of the original spare TCC that was put into service last Sunday). This Bit-3 pci card is: S/N 623605 other tag 85851325 Rev D. The date on its fiber optic TX is: July 2003. This card was sitting on a metal shelf on a nice thick piece of paper. I put it in an anti-static sack and put it in a box in the spare cards storage cabinet along with the other spare Bit-3 stuff.

Run "SpinRite" at level 2 on the original L2 TCC. It immediately says, "Maxtor 31024H1, Drive Number: 0, Cable: Primary, Drive: Master, This drive's SMART system is reporting that this drive is in IMMINENT DANGER of complete failure!
You should only use SpinRite in its data recovery mode (not deep testing and drive exercising), and probably only after first carefully copying any crucial data this drive may contain. With a drive in this shape, there is a danger that any extensive data recovery work could push it over the edge, causing it to die completely. Please copy any vital data from this drive before proceeding."

I called Philippe and he said to give up on SpinRite and just bring the original L2 TCC back to MSU.

I took the top off the old L2 TCC computer to listen to the drive and fans. The drive is making a non-constant noise like a bearing noise, or it could be the servo working a lot. I can not hear any fan noise or feel much air flow but the cpu's heat sink is not too hot. The disk is crammed up at the top and surrounded on all sides. By design there never has been any chance of air flow around the drive.

On the New Spare TCC in both the Philippe and Trigger accounts change:

  - Time Zone to Central
  - Screen Properties Settings: #1 screen to 1024x768, screen #1 is the Primary, screen #2 does NOT have the desk top extended to it
  - Screen saver selection is "Blank" after Wait 10 minutes
  - Power is the Home/Office Scheme with: turn off monitor after 15 minutes, turn off hard disks Never, system standby Never

The Trigger account does not have the privileges required to change the Time Zone. Once set with the Philippe administrator account many of the settings carried over to the non-privileged Trigger account, but one thing that did need to be set there was the screen saver selection of "Blank". I then booted the New Spare TCC and logged into both accounts to verify that these changes have remained.

Put the New Spare TCC back in the cabinet along with its documentation described above, and spare keyboard and mouse.

Thursday Morning from the running currently privileged Trigger account on the New L2 TCC I set it to central time zone.
Thursday Evening from the running currently privileged Trigger account on the New L2 TCC I set it to screen resolution 1024x768 for screen #1 and I then plugged its keyboard, mouse, and display into the KVM switch for the stand up console on M102. That is now working fine but note that the video is still coming out of the #2 side of the video "Y" cable and that we do not know why it is working this way.

Friday afternoon - lots of boots trying to make the new L2 TCC wake up with its video system in low resolution so that it will work with the stand up console. This failed. If you set it to low resolution and then boot it with it plugged into the separate table top monitor then it stays in low resolution. If you set it to low resolution and then boot it with it plugged into the switch then you find that it has gone to high resolution. We did get it to send video out the #1 side of the "Y" cable just by plugging the monitor into the #1 side of the Y and booting. Give up. Press for rebuilding the old system with the old operating system.

Checked all externally visible fans, they are OK. M124 back is 93.8 deg F.

T962 running at 60 Gb/day

Lower preamp Vee 100 mV

3432 1005 1830 1789 2582

------------------------------------------------------------------------------

DATE: 27-SEPT-2009

At: MSU, Action at Fermi

TOPICS: L2 TCC Dies

The L2 TCC died during a store. They noticed the problem at a Run Transition and it took the control room shifters about 40 minutes to figure out that the TCC was dead and that it could not be re-booted or otherwise quickly brought back to life. They contacted: Philippe, Dan, Mike, Selcuk, ....

------------------------------------------------------------------------------

DATE: 16:18-SEPT-2009

At: Fermi

TOPICS: Shutdown is over, Work on -3,31 HD Walk Through, 2nd DMA Repaired Bit-3

We had the first 3 post shutdown stores while I was here. So far basically things are running OK.

Worked on -3,31 HD.
It has been making noise since the first store but not bad enough to require its Exclusion. I watched -3,31 HD many times over the past couple of days. The waveform on the positive side of the differential line is like a series of upside-down "U" stuck right next to each other.

  - At some level that means that the average was OK and the narrower spikes were downwards (i.e. negative energy). That's why the RMS was screwed up but much of the time the trigger rates looked OK and were stable.

  - Once in a while I saw a period where on the rising edge of the upside-down "U" it overshot and went positive. Those were probably the times when the trigger rates changed.

  - Over the day yesterday the width of the upside-down "U" (period of the waveform) narrowed from 17 usec down to 7 usec. This was basically a monotonic narrowing over time.

  - This waveform was always there on Thursday. It was still there at 9:30 Thursday night.

  - On Thursday I saw no change in the waveform with and without the Tev beam.

  - I doubt that this is sparking inside but Dean should comment. The amplitude was too low for a spark (only 50 mV) and there was no super high frequency stuff.

  - The positive and negative sides of -3,31 HD looked symmetric so nothing immediately points to a problem with the driver hybrid.

  - During the time on Thursday when there was beam, besides the upside-down "U" noise waveform, you could also see real Tev energy signals - so the channel is working.

  - The upside-down "U" waveform was gone when I looked starting at 7:30 Friday morning. I have not seen it yet today. The store that came in at noon on Friday did not make it come back.

All externally visible fans are running and things are closed up OK. M124 back temp is 93.3

Having finished testing the 1st TFW blower that was repaired by IMS I brought it back to D-Zero hall and it is stored in the locked cabinet. IMS said that the 2nd TFW blower that they looked at could not be repaired. We left it with them for hopeful recycle.
The 2nd Bit-3 repaired T962 Bit-3 is here in the spares cabinet. 1075 1203 1219 1268 1296 1987 1992 1993 3082 ------------------------------------------------------------------------------ DATE: 26:28-AUG-2009 At: Fermi TOPICS: Shutdown, TFW Supply Brick Fans, M101 Blower is back from IMS, M100 Master Clock Drip Trip, Check the TFW Power Supplies, SES Alarm form M101 Air tested Work with Mentor pld_dmgr using d0sunmsu1 as the display. This still works just fine once all of the fonts are loaded. I loaded them in the following order. Next time I should try skipping the 1999 Mentor fonts and see if it runs OK. xset fp+ /home2/edmunds/Mentor_1999_Fonts xset fp+ /home2/edmunds/Fonts_bst2005 xset fp+ /home2/edmunds/Fonts_Xilinx_41 Put new fans into the +5V, -2V, -4.5V supply from the TFW power supply chassis that failed in the M101 Routing Master a month or so ago. The fans used are: ??? and ???. They installed without much trouble. This is Astec brick SN# The fans plug into P3, P12, and P13. Pin #1 of each of these connectors is the positive side of the 24V supply. P3 has the most direct connection so I would use P3 for the big fan that takes the most current. In the "standard" orientation for looking at these supplies, i.e. looking at the surface of the fans that the air comes out of with the two small fans on the left-hand side, then the pin #1 of these fan power connectors is on the left and it is the positive side of the supply. I mounted the two small fans with their wires coming out at their 4:30 position. The wire come out of the large fan at its 7:30 position. The fan power cables must stay back out of the way for all the modules to fit into the chassis and these cables must not get into the fan blades. I ran the cable to the P3 fan power connector down under the 6/7 pin molex 385V connector. I put on one cable tie. I left a 3/8" piece of the band on this cable tie and stuffed it into the 4:30 hole on the bottom small fan. 
This keeps this cable tie in position and keeps the fan power cables out of the blades of the small fans. - I put a piece of tape over the screw head that is right under the connector that plugs into the J12 fan power connector. - Yes, tighten the internal screws connecting the high current leads, e.g. going to the filter choke, on the power modules. They were loose on the modules that I checked. I could not get at these screws on the 240 Amp "D" type module. P1 is the 8 pin connector that plugs into the PFC module. Its pinout is about: P1-8 24V Fan Positive P1-7 24V Fan Negative P1-6 Un-installed C24 to chassis gnd then to L1 and from the other end of L1 to L2 in series with parallel C4 & C5 to Ground and also to R17 a 2k Ohm to opto-coupler U2 pin #1 and also to a ton of stuff - it looks like a power feed. P1-5 Ground (not chassis gnd) P1-4 +5V for the 74HC74 (I assume it is 5V and not some other V) +5V goes to the 74HC74 pins 1, 4, 10, 13, and 14. P1-3 Direct to pins #3 and #11 on the 74HC74 P1-2 Direct to pin #2 of each output module connector - all in parallel. P1-1 A bypass cap to Ground and then to pin #2 on opto-couple U5 which is a 4N35 (as are all 4 of the opto-couples on the MB). Connectors P4, P6, P8, and P10 go to the output modules. Their pinout is about: P -1 Ground (not chassis gnd) P -2 Direct to pin #2 on P1 the PFC module connector. P -3 Direct from each output module connector to one side of a jumper pin pair, i.e. JP5, JP6, JP7, or JP8. The other side of these jumpers are all tied together and go no place else. This must be part of the "single wire paralleling" stuff see: www.pa.msu.edu/hep/d0/ftp/l1/framework/hardware/ rack_crate/run_ii_power_pan.txt P -4 To pin #4 of that module's opto-coupler and to the negative side of a small electrolytic cap and a 2k Ohm resistor. P -5 To the positive side of the above electrolytic cap and to the other side of the above 2k Ohm resistor and also off into the guts of the "logic". 
P -6 Via JP9, JP10, or JP11 to P -6 of the adjacent output module connector. P -7 To the cathode of a diode and a small value resistor in the guts of the logic. P -8 To pin #5 of that connectors opto-coupler. JP1, JP2, JP3, and JP4 all have one side tied in common and it goes no place else. The other side of these jumpers goes to that modules opto-coupler pin #4. That opto-coupler pin #4 has a bypass cap to pin #6 on that coupler and goes off into the glop of logic. The 74HC74 has pins 8 and 12 tied together and pins 2 and 6 tied together. I think that the 74HC74 drives IRFD120 FETs that drive transformers T1 and T2. Output Module to MB connector: J2-1 Directly to pin #4 of 4N35 opto-coupler U5 and via a bypass cap to pin #6 of that opto-coupler. J2-2 Directly to pin #5 of opto-coupler U5. J2-3 Directly to bridged connection E24-E25 then to Test Point #6 and R19, R20, and unlabled pot R21. This must be part of the "single wire paralleling" stuff see: www.pa.msu.edu/hep/d0/ftp/l1/ framework/hardware/rack_crate/run_ii_power_pan.txt J2-4 This must be the negative side of the supply for the control stuff. It goes to a ground plane on the top surface of the card. It goes to: pin #11 of the LM324s, pin #7 of the MC3423, pin #4 of one of the three 4N35 opto-couplers (U7), and to pins #10 and #12 of the UC2825. J2-5 This must be the positive side of the supply for the control stuff. It goes to: pin #4 of the LM324s, pin #1 of the MC3423, and to pins #13 and #15 of the UC2825. J2-6 Directly to a via labeled E1 and then no place else. J2-7 Directly to a small cap C22 in series with R66 then CR7 going to ground and R63 going to ground and finally the center pin on Q1 a 2N2222. J2-8 Directly to a CR9 in series with R34 to the positive side of the control supply and to pin #8 of the UC2825 and to an RC to ground, and via diode and then Rs and Cs to pins #8 and #9 of an LM324. Another interesting feature is that the -2V and -4.5V bricks, i.e. 
Astec module types A0 and A2 are called 300 Watt 60 Amp max modules. But internally they are labeled as 600 Watt 120 Amp modules, i.e. B0 and B1 modules, that have been "derated" - I assume that means their max current adjustment is set down. The transformer, rectifiers, filter inductor, and output capacitors all appear to be the same in 1/2 of the 240 Amp D type module and in the A type modules.

In general it looks like a hard crowded air path through these supplies.

Each of the modules, both "A" and "D" types, has 4 onboard pots. Two of the pots are relatively easy to get at from the panel and are labeled. One of these is the output voltage adjustment, and the other is labeled "OL" which I assume means over current and not an OVP trip adjustment. A third one is accessible with some effort from the panel. The fourth one is mounted vertically and you can not get to it without taking the supply apart. The manual talks about 3 pots that they call, "adjustment access: voltage, power fail, and over-current".

The transistors for driving the transformer that drives the main switches are JE243 and JE253 from Mot. I think that the main switches are from IR. The current in the main transformer is measured with a current transformer. I.E. there is isolation between the control stuff and the main switches and then isolation between the main switches and the outputs, and there is an opto-coupler isolation. There is a 3 terminal Astec part AS431W. The main rectifiers are IR type 182NQ030 or 1B2NQ030.

It must be a forward converter - the transformer has a diode going to the L input filter and there is a diode from the input side of the L to the other side of the output. I think that the L is in the negative output lead. It's just a one LC main output section with a little ferrite after that.

By far the PFC module looks the most complicated.

All aluminum electrolytic caps appear to be 105 deg C.
I would like to get a small resistor load to test the TFW supplies here at Fermi with some load to keep things honest. 25 Amp at 5 Volts or fold it in two for 20 Amps at 2 Volts would be nice. 0.20 Ohm. Something much easier to carry around than the big old "Toaster".

Blower. I got back the 1st of our TFW type blowers from IMS. This is the rear blower from M101 that failed back in February of 2009. It clearly has new bearings in it. Other things I noticed: the motor shaft looks off center, the set screws from the squirrel cages are in new places, one of the squirrel cages is not in the center of its duct, and the whole thing is dirtier than when it was sent out. I will bring it back to MSU for a long test run and if all is OK bring it back to D-Zero as a 2nd spare.

M100 Master Clock Rack Drip Sensor Trip. Thursday afternoon at about 2:10 PM John Foglesong was looking in the back of M100 for the RG-58 cables that Steve Chappa moved down under the floor a month ago. By moving things John caused a Drip trip just like Steve did a month ago. This time we dug into it and there was clearly a broken plastic housing on the cable connector that fits onto the drip strip at the back of the rack. We could easily make it trip just by very gently touching anything near this cable. John and Mike Cherry put a new housing on the connector and then poked at things for a while to make certain that there were no other problems. Master Clock, TFW and such were back up running at 15:30. 900 Hz ZB by 16:45 with very few crates in the run.

One interesting problem was with the L1 Cal Trig. We had two SES Alarms "L1CAL_TABGAB_M107EVT_FAILED_BC_HEAD_TAIL" and "L1CAL_TABGAB_M107EVT_FAILED_ _GAB_STATUS". Also L2 Cal Trig 0x23 would not work. The problem ended up being that the SCL Receiver on the Saclay SCLD card had not started back up when the Master Clock and TFW started sending out the SCL stream again. L1Cal TCC was also saying that it "failed to capture monitoring data".
I just pushed the button at the top of the SCLD card and the SCL Receiver locked back on (its amber/yellow LED came on). It was then fun to watch the active pedestal control go to work and get things lined up.

Power supply checks. See 18:20-OCT-2007 and 13,14-Sept-2007 for the most recent detailed check.

Measure on the back of the backplane, measured wrt the upper ground bar on the P2P3 backplane.
* -> measured at the terminals on the brick (because you can not reach the backplane)

              +5.0V     +3.3V     -2.0V     -4.5V
             -------   -------   -------   -------
  M122 Top    5.051     3.334     2.014     4.509
  M122 Mid    5.030     3.335     2.030     4.503
  M122 Bot    5.060     3.348*    2.024     4.525
  M123 Top    5.032     3.328*    2.014     4.502
  M123 Mid    5.044     3.337*    2.014     4.505
  M123 Bot    5.038     3.333     2.017     4.506
  M124 Top    5.024     3.348*    2.018     5.144
  M124 Bot    5.012     3.322     2.043     5.233

Now measure at the TOM card front panel test points wrt the rack ground in M122 and M123.

              +5.0V     +3.3V     -2.0V     -4.5V
             -------   -------   -------   -------
  M122 Top    5.041     3.332     2.003     4.504
  M122 Mid    5.023     3.333     2.023     4.503
  M122 Bot    5.051     3.332     2.015     4.520
  M123 Top    5.021     3.334     2.003     4.497
  M123 Mid    5.034     3.335     2.004     4.503
  M123 Bot    5.028     3.332     2.006     4.503

Now check the L1 Cal Trig Readout crate. Measure it by probing the terminal blocks at the back of the backplane.

              +5.0V     +3.3V     -2.0V     -4.5V
             -------   -------   -------   -------
  M101 Top    5.050     3.342     2.103     4.584

Now check the Routing Master crate. Measure it from the slot #1 TOM card test points wrt rack ground.

              +5.0V     +3.3V     -2.0V     -4.5V
             -------   -------   -------   -------
  M101 Bot    5.034     3.313     2.095     4.611

Probably the scariest thing is the -2.095V in the Routing Master crate but it is running OK so I do not think that I will touch it, rather just watch it for now. -4.5V in M123 Top could get bumped up 10 mV. Things look very stable compared to the numbers from the careful checks in Sept and Oct 2007.

Petr Neustroev is back at D-Zero for 3 months. He is the person who verifies / reloads the programmable devices in the hall in the muon system.
Vladimir Sirotenko added the stuff to the SES Epics system so that we now will have an alarm if the M101 air flow stops or if it gets above 105 F in the rack. We tested it and he saw the SES Alarm OK. The name of this SES Alarm is: L1TFW_LV_M101/AIR-TEMP He is going to add my guidance text for this alarm.

Checked the equipment at NuMi. Replaced the aluminum box cover for the VME DMA "Fixer" with a normal G10 cover. Now both the DMA Fixer and the aluminum box cover are stored in the spares cabinet.

1075 7651

------------------------------------------------------------------------------

DATE: 12:14-AUG-2009

At: Fermi

TOPICS: Shutdown, Wiener supply at PAB, TFW supply repair, Meter Check, NuMi Bit-3 swap, M101 Alarms, Voltage Monitor and Shea Box pinout, How to see raw data from the Shea Rack Monitor Boxes

Cal-Ops meeting. -3,26 HD has been noisy a couple of times recently. They think that it looks like about 400 nsec noise but is it really Tev locked or just something about 400 nsec ?

Added the diode upgrade to Wiener supply SN# 3998075 from PAB. It is the one that is so old that it does not have a hole in the back for the alignment pin. That means that it can not be used as a spare for the D-Zero or NuMi systems. This is the supply that came from Saclay. I need to ask Wiener if it is OK to drill a hole or are these supplies different enough that they can not be used with modern Wiener crates (and thus the alignment pin is meant to keep them out via keying).

Pulled apart the +5V,-2V,-4.5V supply from the TFW power supply chassis SN# 11 that failed in the Routing Master on 21-July-09. This is Astec supply SN# 99420440. The intent is to replace the fans in this supply because they have been running for the past 10 years. This is a test to see if we can replace these fans. I checked it before pulling it apart and the 3 fans were all running OK and feel fine. I.E.
this is completely different than the +3.3V supply that came out of this TFW power supply chassis that had 2 clearly bad fans. The fans are held in via self tapping screws with large pitch threads, i.e. for plastic. The frame of the replacement fans needs to be plastic. The three places where the fans plug in are all equivalent 24V supplies. The small fans were plugged into J3 and J13, and the large fan was plugged into J12. The fan cords have two pin 0.1" space 25 mil sq pin connectors on them with polarized housings. I *think* that the AMP contacts may work in the existing housings.

For reference the PFC module appears to use: UC3854 plus the normal analog suspects, e.g. LM324 and LM353 plus some digital stuff, e.g. CD4049, 74HC161, and 74HCU04 and lots of 4N35 couplers. Recall that the PFC module comes off to the side. The output modules appear to use: UC2825, LM324, MC3423 and 4N35 couplers. Recall that the output modules come off by pulling up.

I checked the Fermi 8060A Fluke meter with an MSU Volt-Box. On the 20 Volt scale it is reading correctly to the mV with 5V and 10V input. With a 100 k Ohm series resistor and the Volt-Box at 5V it reads 4.950 Volts which is the correct number for its 10 Meg Ohm input. So the Fluke meter at Fermi is still in good shape to check supplies.

Replaced the Bit-3 at NuMi. Pulled the un-modified Bit-3 T962 #1 SN# 826467 with its MSU "fix dma" adaptor and installed the Bit-3 with the modified U29 T962 #2 SN# 826434. T962 #1 will come back to MSU for a trip to Arden Hills Minnesota. I left the large extended front panel aluminum cover installed at NuMi. I need to make a new block off panel that is I think 4 wide, i.e. about 76 mm wide. For now I will leave the J1 & J2 "fix dma" adaptors at D-Zero.

Work on adding the air flow monitor to rack M101 where the Routing Master and the L1 Cal Trig Readout are located. The purpose of this is to set an SES Alarm if the blower in rack M101 fails again.
I will draw the circuit on page 165 of my note book #10 but it is just:

  - Get 5V from the middle crate in M101.

  - Do this through a 100 Ohm 2 Watt resistor to limit any fault current to 50 mA.

  - This current limited 5 Volts goes through an air flow sail switch and a 105 deg F thermo switch.

  - The output of these series switches lights an LED lamp if everything is OK, i.e. if the air is blowing.

  - The output of these series switches is pulled low by a 1,100 Ohm resistor to ground. This pull down resistor is to guarantee that the output of the series switches goes to ground when one of them opens.

  - The output of these series switches goes through a 1,100 Ohm resistor and then to the + analog input of a Shea Box.

  - The - analog input of this Shea Box channel is grounded.

  - With the switches closed I see about 3.6 Volts. The missing 1.4 Volts must be the drop across the 100 Ohm current limiter caused by the LED current and the 5 mA through the 1,100 Ohm to ground.

  - I will ask for a trip point for the Alarm at 2.5 Volts.

  - I have also put a 1 uFd Tant cap across this right at the Shea Box analog input to reduce the chance that noise causes a false alarm.

  - This signal from the air-flow thermo switch is plugged into Shea Rack Monitor Box Address 26 Analog Channel 32.

  - This analog channel should show about 3.6V when things are OK and about 0V when there is a problem. A 2.5V alarm level should be fine.

The air flow switch and thermo switch are located in the right hand side of M101 Middle crate when looking from the front. The LED indicator is on the blank out front panel that holds the bracket that holds these two switches. The LED is labeled. Recall that lower down there is an LED that shows that the blower motor has power (located on the front panel of the blower chassis). But when there was a failure a couple of months ago the blower motor still had power but it was not turning fast enough to blow any air because of frozen bearings.
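The arithmetic behind the monitor circuit described above can be checked with a short Python sketch. The component values are the ones from this entry; the function names are made up for illustration, and the sketch assumes the Shea Box input draws negligible current. Under that assumption the pull-down carries about 3.3 mA at the measured 3.6 V (about 4.5 mA if the full 5 V were present), and the rest of the roughly 14 mA total would be LED current.

```python
# Sketch of the M101 air-flow / thermo switch monitor arithmetic.
# Component values are from the log entry; the function names and the
# neglect of the Shea Box input current are assumptions of this sketch.

SUPPLY_V = 5.0          # 5V from the middle crate in M101
LIMIT_OHMS = 100.0      # series current-limit resistor
PULLDOWN_OHMS = 1100.0  # pull-down on the output of the series switches
MEASURED_OK_V = 3.6     # seen with both switches closed
ALARM_TRIP_V = 2.5      # requested SES alarm trip point


def worst_case_fault_ma():
    """Current into a dead short after the limit resistor, in mA."""
    return SUPPLY_V / LIMIT_OHMS * 1000.0


def current_budget_ma(measured_v=MEASURED_OK_V):
    """Return (total, pull-down, remainder) currents in mA.

    The remainder is what is left for the LED once the pull-down
    current is taken out of the total through the limit resistor.
    """
    total = (SUPPLY_V - measured_v) / LIMIT_OHMS * 1000.0
    pulldown = measured_v / PULLDOWN_OHMS * 1000.0
    return total, pulldown, total - pulldown


def air_flow_ok(channel_volts, trip_v=ALARM_TRIP_V):
    """True when the monitored voltage is above the alarm trip point."""
    return channel_volts > trip_v


if __name__ == "__main__":
    total, pulldown, led = current_budget_ma()
    print(f"fault limit {worst_case_fault_ma():.0f} mA")
    print(f"total {total:.1f} mA, pull-down {pulldown:.1f} mA, LED {led:.1f} mA")
    print("3.6 V ok:", air_flow_ok(3.6), " 0.0 V ok:", air_flow_ok(0.0))
```

With a trip point of 2.5 V there is about 1.1 V of margin below the normal 3.6 V reading and 2.5 V of margin above the open-switch reading near 0 V.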
Part of the problem of setting this up was remembering how the analog inputs of our Voltage Monitoring system work. The problem is that we had no documentation about the pin out of the adaptor boxes that go between the Voltage Monitoring Cables and the Shea Box Rack Monitor. I've known that we did not have this documented since setting up for Run II. But we got away without this documentation because we had the "end to end" stuff documented, i.e. which Voltage Monitor cable number showed up in which Shea Box channel number.

I found the Shea Box Rack Monitor document and I stuck it in our Run I document pile at the following URL. I did this without spin.

  www.pa.msu.edu/hep/d0/ftp/run1/l1/safety/shea_box_rack_monitor.pdf

I then probed one of our Voltage Monitor Adaptor Boxes (VMABs) to determine its wiring. It is straightforward and rational. The connectors on our Voltage Monitor Adaptor Box are 9 pin and they carry 4 differential analog signals and a ground. A column of 4 of these connectors on a VMAB goes to one connector on a Shea Box, e.g. connectors A, B, C, and D go to connector J10 on the Shea box. Each Shea box connector handles 16 differential analog inputs.

The pin out of a connector on our VMAB is the following:

  pin #1  +IN 1st channel      pin #6  -IN 1st channel
  pin #2  +IN 2nd channel      pin #7  -IN 2nd channel
  pin #3  +IN 3rd channel      pin #8  -IN 3rd channel
  pin #4  +IN 4th channel      pin #9  -IN 4th channel
  pin #5  chassis ground

The top connector in a column, e.g. connector A, on the VMAB goes to the first 4 analog inputs in a Shea box connector. The next VMAB connector, e.g. connector B, goes to the next 4 inputs in the Shea Box connector.
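The VMAB pinout above can be written down as a small lookup, sketched here in Python. The function and table names are hypothetical. Two things are assumptions rather than statements from the log: that connectors C and D continue the "next 4 inputs" pattern given for A and B, and that the default first channel of 32 (the channel number this entry associates with connector A on J10) is the right base for other Shea Box connectors, which it may not be.

```python
# Sketch: decode a VMAB 9-pin connector pin into a Shea Box analog channel.
# From the log: pins 1-4 are +IN of the 1st-4th channel, pins 6-9 are the
# matching -IN, pin 5 is chassis ground; connector A carries the first four
# inputs of a Shea Box connector and connector B the next four.
# Assumptions: C and D continue the same pattern, and first_channel=32 is
# the base channel for the Shea Box connector being decoded.

CONNECTOR_OFFSET = {"A": 0, "B": 4, "C": 8, "D": 12}


def vmab_pin_to_channel(connector, pin, first_channel=32):
    """Return (channel, polarity) for a signal pin, or None for ground."""
    if connector not in CONNECTOR_OFFSET:
        raise ValueError(f"unknown VMAB connector {connector!r}")
    if pin == 5:
        return None  # chassis ground, not an analog input
    if 1 <= pin <= 4:
        index, polarity = pin - 1, "+IN"
    elif 6 <= pin <= 9:
        index, polarity = pin - 6, "-IN"
    else:
        raise ValueError(f"pin {pin} is not on a 9-pin connector")
    return first_channel + CONNECTOR_OFFSET[connector] + index, polarity


if __name__ == "__main__":
    for pin in (1, 6, 2, 7, 3, 8, 4, 9, 5):
        print("A", pin, vmab_pin_to_channel("A", pin))
```

The physical Shea Box pin numbers are deliberately left out of the sketch because the log only records them for connector A.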
As a specific example look at VMAB connector A:

  VMAB Connector A    Shea Box J10     Shea Box
     Pin Number        Pin Number      Function
  ----------------    ------------   -------------
          1                 1         Ch #32 +IN
          6                20         Ch #32 -IN
          2                 2         Ch #33 +IN
          7                21         Ch #33 -IN
          3                 3         Ch #34 +IN
          8                22         Ch #34 -IN
          4                 4         Ch #35 +IN
          9                23         Ch #35 -IN

The mapping of which column of VMAB connectors goes to which Shea Box connector is shown on the top of the VMABs. You could plug the VMAB columns into the Shea Box connectors in any order but there is a standard configuration that we have used since Run I that is shown on the top of the VMABs. This standard configuration appears to have been picked to minimize the criss-cross of cables.

Friday afternoon with Geoff we tested the new M101 air flow & temperature sensor. Our two Shea Boxes (addresses 25 and 26) are on computer d0olctl131, 1553 bus #1 of that computer. To look at this you do a

  > setup d0online
  > RM.py d0olctl131

From that display you can then pick which 1553 bus, and which address, and what units you want the data displayed in.

The other Voltage Monitoring documentation that we have on the web is in the Run IIA L1 Cal Trig Rack and Crates stuff and in the Run I Safety stuff. There may be more but those are the two places that I know to look.

------------------------------------------------------------------------------

DATE: 26:29-JULY-2009

At: Fermi

TOPICS: Shutdown, Power Outage, M101 Blower Replacement, Check M101, M122, M123, M124, Nth try fixing M122 Top Crate Hangs, Routing Master Power Supply, Check Bit-3 VME Card, Finish L1 Cal Trig Wiener "Diode Upgrades", Check L1 Cal Trig Comm Crate Spare Supply Operation, Measure the EM3 Coated Board Resistance, NuMi Wiener Supply

When I got here some one had turned off our MCH-1 LCD display console.

M101 Air Blower Replacement

Work to replace the rear blower in the Routing Master, M101 bottom crate.
The setup of cards in the Routing Master is: Slot #1 Vertical Interconnect on TOM SN# 06 Slot #3 RM SBC on TOM SN# 02 The Ethernet cable goes into the lower RJ-45 connector on the SBC. This TOM has no VME Terminators, no chips in its Timing and Control Bus section, no chips in its front panel section, and it has a jumper wire from P1-16D to P2-1E. I think that this jumper is part of the Bus Grant #3 IN wiring to skip over empty slot #2 (and that's why there is a backplane jumper wire too). Slot #5 FOM-19 "THE" Card Green with a front paddle card of single signal cables. slot #7 AONM-12 "THE" Card Orange with a front paddle card carrying 4 twist-and-flat cables. slot #9 FOM-25 "THE" Card Green with a front paddle card carrying 4 twist-and-flat cables. Replacing the rear blower went OK. What would make it a lot easier is if the screws that hold the blower to the pan were hex head instead of slot head. These are 10-32 so I need to order some hex head 1/2" 10-32 screws and keep them here just in case another blower needs to be replaced. During the installation of the new blower having the nuts glued in place helps a lot. Check: Blowers, Radiators, and Hoses I checked through the blowers and the radiators and the hoses in the racks before turning things back on after the power outage. All of the hoses look OK - none of them are showing any cracks. M122: I could not access the front blower without pulling apart some cabling so I did not check it. The rear blower feels OK when I turn it by hand. There are 2 radiators in M122 and they both have mechanical clamps over the sections that are likely to leak. M123: The front blower feels OK. The rear one feels stiffer than the others but turns smoothly. Note that there is (and has been for many years) more noise when the blowers start up in this rack than in the other racks. There are 2 radiators in M123 and both of them have been soldered over in the section that is most likely to leak. 
M124: I could not access the front blower to check it without pulling things apart so I have not checked the front blower. The rear blower feels OK. The small commercial blower tray below the 6U VME TFW "control crate" is all OK. There is 1 radiator in this rack and it has a mechanical clamp over the section that is likely to leak.

M101: Both blowers feel OK. This rack has one radiator and it has neither a mechanical clamp nor solder over the likely to leak spot.

M122 Top Crate Bus Hang work: In an attempt to explore the M122 Top Crate hang problem some more, the TOM card that has been in use in this crate for the past many years, i.e. TOM SN# 16, was replaced with TOM SN# 11. TOM SN# 16 has had a 200 nsec delay in its AS* signal since 13-Mar-07. This started as a 100 nsec delay of AS* on TOM SN# 16 on 11-Aug-06. I installed TOM SN# 11 which has both a 200 nsec delay in its AS* signal and a 200 nsec delay in its DTACK* signal. This delay in the DTACK* signal is the same as what was required to prevent the DAVE "THE Cards" from hanging in the Routing Master (but it's very hard to imagine that the cause of the hang in the RM and the cause of the hang in M122 Top are the same). The TOM that I pulled out of service (SN# 16) will stay here in the spare card cabinet.

TFW was back up and running by about 13:00. Folks had about 12 crates running OK in ZB by 15:00 or so. Then Steve tripped off the Master Clock so we got to start everything back up.

RM Power Supply

The RM Power Supply that died last week on 21-July-09 was TFW Power Supply Chassis Number 11 which consists of 3.3V module SN# 99420448 and 5V,2V,4.5V module SN# 99420440. It is the 3.3V module that is dead and it appears that the problem may be its PFC input section. The 3.3V module is an Astec model VS3-D1-D1-20 (-CE)

The TFW power supply that was installed in the Routing Master last week is Chassis Number 2. Because it is installed I can not read the SN# of the modules in this chassis.
I assume that it has the modules indicated in the TFW Power Supply inventory list at: www.pa.msu.edu/hep/d0/ftp/l1/ framework/inventory/run_2_power_supply_inventory.txt The TFW Power Supply that I brought down here as a 2nd spare is Chassis Number 1 which have 3.3V module SN# 96390265 and 5V,2V,4.5V module SN# ?. I took apart the dead power supply TFW power supply chassis number 11 and removed its dead 3.3V module SN# 99420448. One of this modules two fans on the PFC input section is completely frozen. The big fan on its output sections clearly has very bad bearings. A big concern is, are the internal fans in all of the VS3 supplies in this bad a shape ? If so then we need to replace them. Current L1 Cal Trig Bit-3 VME card I checked the Bit-3 VME card in the Communications Crate in the L1 Cal Trig. It is: in house label "Spare L1 Cal" Bit-3 labels SN 588596 85853435 Rev E optic TX date Nov. 2002 See the log book for 1:3-APR-09 for other Bit-3 inventory information. L1 Cal Trig ADF Crate and Communications Crate Power Supplies Mike Matulik and Mike Cherry have completed the "Diode Upgrade" of the Wiener Supplies for the L1 Cal Trig ADF Crates and Communications Crate. The current setup of power supplies is: ADF Crate Wiener Supply SN --------- ---------------- A 5196006 B 5196008 C 5196009 D 5196007 The spare ADF Crate supply is SN 5196010. All the supplies just moved down the chain a step. The Communications Crate supply is SN 3295142. Recall that the Communications Crate Wiener power supply chassis has only +3.3V and +5V VME supplies in it. I did test and verified that an ADF Crate Wiener power supply can be used in the Communications Crate. The ADF supply in the Comm crate starts up and runs OK. When you look at the voltages you see the +3.3V and +5V supplies putting out their nominal voltages. When you look at the analog +-5V supplies they are putting out 0 Volts. 
I guess that maybe the power supply chassis looks at the "Bin Memory"
and decides that it does not need to start the analog +-5V supplies.

EM3 Sample of Circuit Board DC Resistance Measurement

Marvin and I measured the resistance of a sample of EM3 circuit board.
It has resistive coat on both sides.  Marvin put 6 clips (the real clips
that he got from John Najdzion) on both ends of the sample and there are
orange wires coming from both ends.

The sample of circuit board is about 32 1/2" long and about 6 1/4" wide
(the coated part).  This is about 5.2 squares.

For these tests we are using a 0-60V Lambda supply and an HP 3457A
meter.  We will try first just using the HP meter as a uAmp meter.
It reads 0.002 uA with no input.  Monitor the Lambda output with a
Fluke multimeter.

For the first test just look at one of the 100 Meg Ohm Resistors to
verify that this setup is working as expected:

   Lambda    Fluke      HP         ==>
   Volts     Volts     uAmps     Meg Ohms
   ------    -----     -----     --------
     1.0      1.23     0.014      102.5
     5.0      5.18     0.054       99.62
    10.0     10.17     0.104       99.71
    20.0     20.09     0.203       99.95
    40.0     39.93     0.402       99.83
    59.8     59.55     0.599       99.75

Now connect up the piece of EM3 circuit board to measure it:

   Lambda    Fluke      HP         ==>
   Volts     Volts     uAmps     Meg Ohms
   ------    -----     -----     --------
     0.0      0.19     0.002        -
     1.0      1.23     0.004        -
     5.0      5.23     0.010        -
    10.0     10.17     0.023      484.3
    20.0     20.07     0.055      378.7
    40.0     39.95     0.125      324.8
    59.8     59.61     0.206      292.2

As we sit at the 60V setting the 0.206 uA number starts drifting down
almost immediately.  After 5 or 10 minutes it is down to 0.142 uA and
the indicated current is still going down.

OK, first I want to verify that we are not seeing any leakage between
the two orange wires that I have twisted together.  I disconnect one of
the orange wires from the sample and I see:

   Lambda    Fluke      HP         ==>
   Volts     Volts     uAmps     Meg Ohms
   ------    -----     -----     --------
    60.0     59.79     0.002        -

That's good - at this scale the insulation on the orange wires
is perfect.
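The Meg Ohm columns above appear to be the Fluke Voltage divided by the
HP current after subtracting the 0.002 uA no-input reading of the HP
meter.  That subtraction is not stated in the log; it is inferred from
the tabulated numbers, which it reproduces.  A minimal sketch of that
arithmetic follows (the names meg_ohms and ZERO_OFFSET_UA are made up
here for illustration):

```python
# Zero reading of the HP 3457A with no input, in uA (from the log entry).
ZERO_OFFSET_UA = 0.002


def meg_ohms(fluke_volts, hp_uamps, zero_ua=ZERO_OFFSET_UA):
    """Resistance in Meg Ohms; Volts divided by uAmps is already Meg Ohms.

    Assumption: the meter zero is subtracted from the reading first.
    Returns None when the net current is not above the meter zero.
    """
    net_uamps = hp_uamps - zero_ua
    if net_uamps <= 0:
        return None
    return fluke_volts / net_uamps


# (Fluke Volts, HP uAmps) for the 100 Meg Ohm reference resistor check.
reference_scan = [
    (1.23, 0.014), (5.18, 0.054), (10.17, 0.104),
    (20.09, 0.203), (39.93, 0.402), (59.55, 0.599),
]

for volts, uamps in reference_scan:
    print(f"{volts:6.2f} V  {uamps:5.3f} uA  {meg_ohms(volts, uamps):7.2f} Meg Ohms")

# Sample geometry from the log: about 32.5" long by 6.25" wide.
squares = 32.5 / 6.25
print(f"sample is about {squares:.1f} squares")
```

Rows where the reading sits at the meter zero come back as None, which
corresponds to the "-" entries in the tables.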
Next I wipe both sides of the EM3 sample a number of times with ethanol
absolute and kleenex.  After cleaning the sample with ethanol start
monitoring it again at 60 Volts.

   Lambda    Fluke      HP                   ==>
   Volts     Volts     uAmps      Time     Meg Ohms
   ------    -----     -----      ----     --------
    60.0     59.77     0.137      11:28     442.7
      "                0.130      11:36     466.9
      "                0.124      11:42     489.8
      "                0.119      11:48     510.8

OK now as an experiment switch the polarity across the circuit board
sample, i.e. switch the connections to the orange wires and start
monitoring it again with 60 Volts across it:

   Lambda    Fluke      HP                   ==>
   Volts     Volts     uAmps      Time     Meg Ohms
   ------    -----     -----      ----     --------
    60.0     59.76     0.130s     11:49
      "                0.180s     11:51
      "                0.190s     11:52
      "                0.170s     11:54
      "                0.161      12:02     375.8
      "                0.131      12:15     463.3
      "                0.125      12:34     485.9
      "                0.108      13:38     563.8
      "                0.097      14:30     629.1
      "                0.099      15:55     616.1
      "                0.107      17:31     569.1
      "                0.098      18:00     622.5

OK, now gently warm the EM3 sample in front of an electric foot warmer /
room warmer.  We did not let the sample get hot.  Rather there was just
warm air blowing over it.  Warm it from both ends over about 45 minutes
trying to de-humidify it.  The point of this is to verify that there is
no big change in the resistance of the sample after warming it, and also
to verify that this technique, which we will use tomorrow after the
sample has been in LN2, does not hurt anything.

Return to monitoring the sample at 60 Volts.  I see current numbers in
the 0.095 to 0.110 uAmp range over a period of 15 minutes or so.
I.E. there was no change from this cooking.

OK, now try putting the sample in the styrofoam insulated aluminum tray
that will be used to hold the sample tomorrow when it goes into the LN2.
The circuit is:

   Lambda black negative output to:   Lambda white ground terminal
                                      HP 3457A LOW input terminal
                                      the aluminum LN2 tray
                                      Fluke multimeter - input

   Lambda red positive output to:     one side of the sample
                                      Fluke multimeter + input

   The other side of the sample goes to the HP 3457A I terminal.
Now that it is in the tray start monitoring it again with 60 Volts
across the sample.  The current numbers are in the same range as above,
i.e. 0.095 to 0.110 uAmp.  I.E. we can monitor the sample in the
aluminum tray OK.

Now finally try moving the 6 clips at each end.  The bus wire was
soldered to the clips while they were on the EM3 sample so there is a
little concern that this soldering may have done something to the
resistive coat that was right under the clip while it was soldered.
Pull the clips off and move them over by about 1.5 clip widths.
Go back to monitoring with 60 Volts across the sample.

This makes a big difference.  Now we have currents in the 0.5 uAmp
range and the numbers are more stable.

   Lambda    Fluke      HP                        ==>
   Volts     Volts     uAmps           Time     Meg Ohms
   ------    -----     -----           ----     --------
    60.0     59.73     0.509           20:41     117.8
      "                0.508           21:08     118.0

Add 5 more clips at each end to help make certain that we have a good
connection to the resistive coat.  There are now 11 clips at each end.
The EM3 board sample is in the aluminum tray that will be used for LN2.

Make a voltage scan at about 10PM Tuesday.

   Lambda    Fluke      HP                        ==>
   Volts     Volts     uAmps           Time     Meg Ohms
   ------    -----     -----           ----     --------
    60.0     59.73     0.525           22:xx     114.2
    40.0     39.87     0.350           22:xx     114.6
    20.0     20.07     0.177           22:xx     114.7
    10.0     10.19     0.091 or 0.090  22:xx     113.9
     5.0      5.18     0.047           22:xx     115.1
     2.5      2.72     0.025           22:xx     118.3
     1.0      1.24     0.012 or 0.013  22:xx     118.1
     0.1      0.02     0.002           22:xx       -

Let the setup sit overnight with the Lambda set at 10 Volts.

Make a voltage scan at about 7AM Wednesday.

   Lambda    Fluke      HP                        ==>
   Volts     Volts     uAmps           Time     Meg Ohms
   ------    -----     -----           ----     --------
    60.0     59.73     0.524 or 0.525  07:xx     114.3
    40.1     40.00     0.351           07:xx     114.0
    20.0     20.01     0.176 or 0.177  07:xx     114.7
     9.9     10.01     0.089           07:xx     115.1
     4.8      5.00     0.045 or 0.046  07:xx     114.9
     2.3      2.50     0.023 or 0.024  07:xx     116.3
     0.8      1.00     0.010           07:xx     125.0
     0.1      0.00     0.002           07:xx       -

Let the setup sit for an hour at 10 Volts and then make another
voltage scan.  All of this work is at RT.
   Lambda    Fluke      HP                        ==>
   Volts     Volts     uAmps           Time     Meg Ohms
   ------    -----     -----           ----     --------
    60.0     59.73     0.524           08:xx     114.4
    40.1     40.00     0.351           08:xx     114.6
    20.0     20.00     0.176           08:xx     114.9
     9.9     10.00     0.089           08:xx     114.9
     4.8      5.00     0.045           08:xx     116.3
     2.3      2.50     0.023 or 0.024  08:xx     116.3
     0.8      1.01     0.010 or 0.011  08:xx     118.8
     0.1      0.00     0.002           08:xx       -

Now take the setup outside by the large LN2 tank.
First make another voltage scan at Room Temp.

   Lambda    Fluke      HP                        ==>
   Volts     Volts     uAmps           Time     Meg Ohms
   ------    -----     -----           ----     --------
    59.9     59.79     0.522           09:30     114.8
    40.1     40.00     0.349           09:30     114.9
    19.9     20.00     0.174 or 0.175  09:30     115.3
     9.9     10.01     0.087 or 0.088  09:30     115.7
     4.8      5.00     0.044           09:30     116.3
     2.3      2.50     0.023           09:30     113.6
     0.8      1.00     0.010           09:30     111.1
     0.1      0.00     0.001           09:30       -

Now fill the aluminum tray with LN2.  This was done over a period of
only about 5 minutes, i.e. the EM3 sample was cooled rather quickly.
Things are not boiling too badly.  Let things settle for a couple of
minutes and make a scan.

   Lambda    Fluke      HP                        ==>
   Volts     Volts     uAmps           Time     Meg Ohms
   ------    -----     -----           ----     --------
    60.0     59.77     0.269           09:45     223.0
    40.1     40.01     0.180           09:45     223.5
    20.0     20.01     0.090           09:45     224.8
     9.9     10.01     0.045           09:45     227.5
     4.9      5.00     0.023           09:45     227.3
     2.4      2.51     0.012           09:45     228.2
     0.8      1.00     0.005 or 0.006  09:45     222.2
     0.1      0.00     0.001           09:45       -

Now while it is still at LN2 temperature make a second voltage scan
just to check things.

   Lambda    Fluke      HP                        ==>
   Volts     Volts     uAmps           Time     Meg Ohms
   ------    -----     -----           ----     --------
    60.0     59.77     0.270           09:55     222.2
    40.1     40.01     0.180           09:55     223.5
    20.0     20.01     0.090           09:55     224.8
     9.9     10.01     0.045           09:55     227.5
     4.9      5.00     0.023           09:55     227.3
     2.4      2.51     0.012           09:55     228.2
     0.8      1.00     0.005           09:55     250.0
     0.1      0.00     0.001           09:55       -

Disconnect the sample, remove it from the LN2, and put it in a black
plastic bag to let it warm up to room temperature.  After it is warm
remove it from the bag and let it dry in the sun.  It got warm in the
direct sun.  We let both sides of the sample cook in the sun.
We emptied the LN2 tray and let it warm up and dry out.
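As a cross-check on the point by point Meg Ohm numbers, each voltage
scan can be fit with a straight line, current = Volts / R + offset, so
that the meter zero drops out as the intercept.  The sketch below does
this for the 09:30 room temperature scan and the 09:45 LN2 scan.  It is
only an illustration: the fit is not something done in the log, the
function names are made up here, and where the log gives two readings
("0.174 or 0.175") the midpoint is used.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def scan_resistance(scan):
    """Return (Meg Ohms, zero offset in uA) from (Fluke Volts, HP uAmps) pairs."""
    volts = [v for v, _ in scan]
    uamps = [i for _, i in scan]
    slope, offset = fit_line(volts, uamps)  # slope is in uA per Volt
    return 1.0 / slope, offset


# 09:30 scan at room temperature; "x or y" entries entered as midpoints.
room_temp_scan = [
    (59.79, 0.522), (40.00, 0.349), (20.00, 0.1745), (10.01, 0.0875),
    (5.00, 0.044), (2.50, 0.023), (1.00, 0.010), (0.00, 0.001),
]

# 09:45 scan with the tray full of LN2.
ln2_scan = [
    (59.77, 0.269), (40.01, 0.180), (20.01, 0.090), (10.01, 0.045),
    (5.00, 0.023), (2.51, 0.012), (1.00, 0.0055), (0.00, 0.001),
]

r_warm, zero_warm = scan_resistance(room_temp_scan)
r_cold, zero_cold = scan_resistance(ln2_scan)

print(f"room temp: {r_warm:6.1f} Meg Ohms  (zero {zero_warm:+.4f} uA)")
print(f"LN2:       {r_cold:6.1f} Meg Ohms  (zero {zero_cold:+.4f} uA)")
print(f"cold / warm ratio: {r_cold / r_warm:.2f}")
```

The fit weights the high Voltage points most heavily, so it is less
sensitive than the low Voltage table rows to the last digit of the
uAmp reading.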
We put things back together, i.e. the sample is back in the tray and
tied up.  Make one measurement now and then let things settle down for
about 1 1/2 hours.

   Lambda    Fluke      HP                        ==>
   Volts     Volts     uAmps           Time     Meg Ohms
   ------    -----     -----           ----     --------
    60.0     59.77     0.552           10:30     108.5

We have now waited about 1 1/2 hours and will make a room temperature
voltage scan before cooling for the 2nd time.

   Lambda    Fluke      HP                        ==>
   Volts     Volts     uAmps           Time     Meg Ohms
   ------    -----     -----           ----     --------
    60.1     59.74     0.547           12:10     109.3
    40.1     40.00     0.363           12:10     110.3
    19.9     20.00     0.181           12:10     110.8
     9.8     10.01     0.091           12:10     110.6
     4.7      5.00     0.046 or 0.047  12:10     108.7
     2.2      2.51     0.024           12:10     106.8
     0.7      1.00     0.010           12:10     105.3
     0.3      0.00     0.000 or 0.001  12:10       -

Now cool it by adding LN2 rather quickly to the tray.  It takes about
5 minutes to fill it with LN2.  Now make a cold voltage scan.

   Lambda    Fluke      HP                        ==>
   Volts     Volts     uAmps           Time     Meg Ohms
   ------    -----     -----           ----     --------
    60.1     59.76     0.304           12:35     197.2
    40.1     40.00     0.202           12:35     199.0
    19.9     20.01     0.101           12:35     200.1
     9.8     10.04     0.051           12:35     200.8
     4.7      5.01     0.025 or 0.026  12:35     204.5
     2.2      2.51     0.013           12:35     209.2
     0.7      1.00     0.006           12:35     200.0
     0.2      0.01     0.001           12:35       -

While it is still at LN2 temperature make a second voltage scan.

   Lambda    Fluke      HP                        ==>
   Volts     Volts     uAmps           Time     Meg Ohms
   ------    -----     -----           ----     --------
    60.1     59.76     0.303 or 0.304  12:45     197.6
    40.1     40.00     0.202           12:45     199.0
    19.9     20.01     0.100           12:45     202.1
     9.8     10.04     0.050 or 0.051  12:45     202.8
     4.7      5.01     0.025 or 0.026  12:45     204.5
     2.2      2.51     0.013           12:45     209.2
     0.7      1.00     0.005 or 0.006  12:45     222.2
     0.2      0.01     0.001           12:45       -

While it is still at LN2 temperature let it sit for a while with
60 Volts across the sample and see if anything changes.
   Lambda    Fluke      HP                        ==>
   Volts     Volts     uAmps           Time     Meg Ohms
   ------    -----     -----           ----     --------
    60.1     59.78     0.303 or 0.304  12:56     197.6
    60.1     59.77     0.303 or 0.304  12:59     197.6

------------------------------------------------------------------------------

DATE:  21,22-JULY-2009         At:  MSU  action at Fermi

TOPICS:  Routing Master Power Supply Fails

Tuesday evening at about 10 PM I was paged by Bill.  He told me that
the Routing Master power supply (bottom crate in M101) had failed and
that it showed all 4 of its outputs in alarm.  They had tried to cycle
it once and it did not come back on.  The decision was to wait until
Wednesday morning to replace it.

Wednesday morning I sent Mike Matulik a note and called him to discuss
the replacement of the Routing Master supply.  Mike and Mike dropped
their other work and made the replacement right away.

Once the hardware was powered up again we worked from here with the
DAQ shifter to get the Routing Master running.  The DAQ shifter issued
the Stop command to the Routing Master (so that we could Configure its
FPGAs) but it was clear that the SBC in the Routing Master was not
paying any attention.  Bill pushed the front panel button on the RM's
SBC and then it obeyed the Stop command and we configured its FPGAs.
From there on the rest of the DAQ system started up without any problem.

I need to write up a general procedure for replacing Run II TFW type
power supplies.  We have had only one other failure which was the M123
middle crate supply that failed on Sunday evening February 29th 2004.

I need to bring another spare TFW type power supply down to Fermi.

------------------------------------------------------------------------------

DATE:  8:9-JULY-2009         At:  Fermi

TOPICS:  Shutdown, Purple Haze Noise work, ADF-TAB Link Error ATC,
         Wiener Supply Diode Upgrade, Feedthrough Leak, Comm Crate Setup

Marvin gave Malter current and Purple Haze Noise talks in the
Calop meeting.
Talked with Pete Simon - he and other mechanical crew folks have looked
around for the Central Cal HV terminal blocks and pins and have not
found them.  The only chance may be to find some spare CC modules that
have been stored in some radiation material storage place on site.
No one remembers if there are any spare modules besides the ones in the
NWA test cryostat.

From the Clean Room log book it looks like the work to mechanically
stack the CC modules in the CC cryostat was finished by November 1989.
The welding on of the inner heads was started by late June 1990.
This sets a window of dates when the inner cabling was done.

I visited Fermi Visual Media Services and they let me see every picture
that they took during this 8 month period.  The following are the Fermi
Image Numbers of the CC assembly in the clean room.  None of these are
close up technical photographs.

   90-014 through 90-031     taken on   4-Jan-1990
   90-588 through 90-601     taken on  26-Apr-1990
   90-734 through 90-744     taken on  16-May-1990
   90-778 through 90-783     taken on  16-May-1990

Sent the specific proposal and request to proceed note for the welding
over the gap experiment to George.  I should have also sent it to Walter.

We have had a TAB #7 Chip #3 Input #2 ADF to TAB link error for the
last week or so.  Selcuk has tried replacing the LVDS cable but that did
not help.  When he moved the cables around the problem stayed with the
ADF/ATC output.  This link is driven from the ADF/ATC in ADF Crate "B"
slot 13.

  - Before doing anything we tried re-Configuring the ADF Crates and
    Initializing the system.  This did not help.

  - The problem output was the center LVDS cable output on the ATC.
    No matter which of the 3 LVDS cables we plugged into this output
    the TAB receiver at the end of that cable showed errors every
    5 seconds and the other 2 cables - TAB receivers looked OK.

  - Working slowly we discovered that if we unplugged the PFC cables
    (both of them) then all 3 LVDS cables - TAB receivers were happy.
    This was a good solid test.  We could repeatedly plug and unplug
    the PFC cables and the errors on whichever cable - TAB receiver
    was plugged into the center LVDS connector would show up / go away.

  - We pulled out the ATC card that has been in the system for the
    past N years (it is ATC SN# 012) and installed ATC SN# 074.
    It has the same problem but this time it is with its top LVDS
    connector.  If either PFC cable is plugged in then there are errors
    on whichever TAB receiver is plugged into its top LVDS connector.
    If both PFC cables are unplugged then all 3 LVDS connector outputs
    are fine.

  - Try a third ATC card.  This one is labeled "ATC Spare #8".
    It works fine - all 3 LVDS links look fine with the PFC cables
    plugged in.

What is wrong with ATCs SN# 012 and 074 ?  Have we seen this kind of
problem before ?  This must have something to do with the grounds.
Is the "front panel" shorting the two ground planes on the ATC card
together ?  Is the shell on the PFC cable connector touching the front
panel ?  A thing that I noticed is that the top PFC connector on ATC
SN# 012 is hard to get the cable plugged into - you have to hold down
the latch on the cable connector to get it to go in.  Should we try
running one of these ATC cards without its front panel ?

Mike and Mike have added the "diode upgrade" to the spare ADF Crate
Wiener supply and on Thursday they swapped the spare supply into
ADF Crate "A" (where it will now stay) and now they will upgrade the
supply from Crate "A".

I checked the setup on the L1 Cal Trig Communication Crate.  This
supply looks like it has only +5V high current and +3.3V high current
bricks in it.  Some points of the setup are rather strange.  During
normal operation I see a draw of about 14 Amps from the +5 Volt supply
and a draw of about 1 Amp from the +3.3 Volt supply.
Current Comm Crate Setup:

   Communication       U0           U3
   Crate             +5.0 V       +3.3 V
   Setting           Digital      Digital
   -------------     -------      -------
   Ilim               30 A         50 A
   Uadj                0 %          0 %
   Unom              5.00 V       3.29 V
   OVP               6.25 V       4.50 V
   Imax               25 A         45 A
   Umin              4.50 V       3.00 V
   Umax              5.50 V       3.50 V

   Fans 3000 rpm
   CAN-Bus Address = 6

Notes:

   Ilim  is the supply's output current limit.
   Uadj  is the fine adjustment on the supply's output Voltage.
   Unom  is the coarse adjustment on the supply's output Voltage.
   OVP   is the trip point for the supply's Over Voltage Protection.
   Imax  is the maximum current that may be drawn from the supply
         and still have it report "good" status to the monitoring.
   Umin  is the minimum output Voltage that can be coming from the
         supply and still have it report "good" to the monitoring.
   Umax  is the maximum output Voltage that can be coming from the
         supply and still have it report "good" to the monitoring.

   The +5V digital supply and the +3.3V supply are both 115A bricks.

Sidewalk cabinet combination.

Swap supplies in the Top T962 VME crate.  It now has a diode upgraded
supply.

Walter says that the new feedthrough failed the leak test.

3291

------------------------------------------------------------------------------

DATE:  24:26-JUNE-2009         At:  Fermi

TOPICS:  Shutdown, Purple Haze Noise work, MCH water leak work,
         Wiener supply work

Wednesday 24-June-09

Worked on the Purple Haze Noise with Marvin and Dean.

1. Verified that we could turn ON and OFF the PHN by powering the
   problematic calorimeter element in the various ways:

     - power from LAR5N, float LAR5S
          see no PHN (this mode is currently used for Physics)

     - power from LAR5N, connect but not power LAR5S
          see PHN of one sign

     - power from LAR5S, connect but not power LAR5N
          see PHN of the other sign

   We looked at Trigger Towers: +5,22  +6,22  +5,23  +6,23 to see the
   PHN.  Before starting this set of tests Dean installed Trigger
   Summers in these BLS cards with no cut resistors.  The scope was set
   for 200 mV/Div and 4 or 10 usec/Div to get a full view of the PHN.
   TTs +6,22 and +5,23 at times showed a periodic waveform in addition
   to the PHN bursts of noise.  This periodic waveform may have been
   associated with (stimulated by) the PHN.  All 4 of these TTs appear
   clean when the HV is off, but I still need to look at this more.

2. Apply 30 Volts to just LAR5S.  The Cal HV display shows no current
   draw on the supply.  In the South Filter Box put the Fluke VOM
   across the filter capacitor for this channel and it shows 30 Volts.

3. In the South Filter Box put a 100 k Ohm resistor between ground and
   the Calorimeter feedthrough end of the filter's 10 k Ohm output
   resistor.  With only the LAR5S supply turned on and sending out
   30 Volts the Cal HV display shows a 30 uA current draw on this
   supply and the Fluke VOM shows the expected Voltages at the various
   points in the filter circuit.

4. In the South Filter Box ground the supply end of its 1 Meg Ohm
   input resistor.  With only the LAR5N supply turned on and sending
   out 30 Volts the Cal HV display shows no current draw on this supply
   and the Fluke VOM shows no voltage at any point in the South
   Filter Box.

5. Working from the South Filter Box, use the TDR to look at the HV
   cable going to the Calorimeter feedthrough.  On the problematic
   channel the TDR shows an up going trace at a point that should be
   near the end of the HV cable inside the Calorimeter.  Looking at
   other HV cables the trace goes down at a point that should be near
   the end of the cable.

Tentative conclusions from Wednesday's tests:

  - There are no problems in the South Filter Box itself.

  - There appears to be a break in the connection from the South Filter
    Box to the problematic Calorimeter element.  There is some
    indication that the break is near or at the end of the cable inside
    the cryostat at the point where it connects to the wiring that
    fans-out the HV to the ?5? boards in this calorimeter element.

  - The problem is not just that a clip has become disconnected from
    only one of the boards in this calorimeter element.
    Rather the South feed is disconnected from all of the boards in
    this Calorimeter element.

Thursday 25-June-09

Worked on the Purple Haze Noise with Marvin and Dean.

1. Sent a note to Jerry Blazey asking about the HV wiring inside the
   Central Calorimeter.

     - He did not immediately recall any details about the HV cabling
       inside CC and he recommended that I contact Alan Bross.  He also
       thinks that Terry Huering and Sarah Durston did the inside
       HV wiring.

   Email notes were sent to Alan, Sarah and Terry asking when we could
   call them to learn about the inside HV cables.  So far (6PM)
   no replies.

2. The North Filter Box has been opened and the filter elements
   associated with the LAR5N supply identified.  These filter elements
   were checked with the LAR5N supply set at 30 Volts.  Checks were
   done both with and without a 100 k Ohm load on the output of this
   filter.  Everything looks OK with this filter.  Both voltage and
   current measurements look good.

3. Calibrate the TDR to a 118 foot piece of Reynolds HV cable.
   The velocity factor is about 0.66 which is believable.  Use the now
   calibrated TDR with a better set of clip leads to look into the
   Calorimeter along the HV lead of the problem channel and other
   nearby channels.  We also looked in along HV cables from the North
   Filter Box, i.e. the cable from supply LAR5N and other nearby
   cables.  Pictures were taken of the TDR screen to record what
   was seen.

4. Checked for photographs of the assembly of the Central Calorimeter
   with Fermi Visual Media Services.  No photos were found with enough
   detail to help us so far.

Tentative conclusions from Thursday's tests:

  - There are no problems in the North Filter Box.  Note that we did
    not think that there were any problems with the North HV feed to
    the problematic calorimeter element but the North Filter was
    checked anyway just for completeness and to eliminate any chance
    of a North-South mix up.
  - Looking in with the TDR, all cables look the same except for the
    South feed to the problematic Calorimeter element.  The reflections
    from this feed differ in two ways:

      - The discontinuity on this feed starts about a foot or two
        nearer the feedthrough than it does on the good HV feeds.

      - The TDR shows the discontinuity on the PHN causing feed going
        up to a higher Z indicating an open.  The TDR shows the
        discontinuity on the OK channels going down to a lower Z
        indicating the lumped C at the end of the line.

  - The current suspicion is that the break is at the point where the
    Reynolds HV cable connects to the "pig tails" that run into the
    EM Calorimeter module.  We are trying to contact someone who
    knows/remembers how this connection was made.

Index of the TDR and Filter Box pictures taken on Thursday:

   IMG_0004.jpg   Looking into the Calorimeter on the HV feed from
                  LAR5S, i.e. the PHN causing feed.
                  200 mV/Div   5 ft/Div

   IMG_0005.jpg   Same as IMG_0004.jpg but 2 ft/Div

   IMG_0006.jpg   Looking into the Calorimeter on a random "good"
                  HV feed from the South Filter Box
                  200 mV/Div   2 ft/Div

   IMG_0007.jpg   This shows the ground ring and ground resistor on
                  the Reynolds connector on the cable coming from the
                  HV supplies into the Filter Box.

   IMG_0008.jpg   Same as IMG_0007.jpg but not over exposed

   IMG_0009.jpg   Close up view of the Filter Box insides.

   IMG_0010.jpg   Wider view of the Filter Box insides.

   IMG_0011.jpg   Looking into the Calorimeter on a random "good"
                  HV feed from the North Filter Box
                  200 mV/Div   2 ft/Div

   IMG_0012.jpg   Same as IMG_0011.jpg but 5 ft/Div

   IMG_0013.jpg   Looking into the Calorimeter on the HV channel from
                  supply LAR5N, i.e. the North end of the Calorimeter
                  Element that has the PHN making South feed from
                  supply LAR5S
                  200 mV/Div   5 ft/Div

   IMG_0014.jpg   Same as IMG_0013.jpg but 2 ft/Div

Thursday night we talked with John Najdzion.
He did a lot of work on the module internal assembly at BNL and in the
clean room doing CC installation and provided a lot of information
about the HV wiring.  He also found for us some very useful pictures of
the CC showing parts of the HV wiring inside the cryostat.  He provided
descriptions of the components used in the HV wiring and he thinks that
some spares of these components are still stored at D-Zero.

Friday 26-June-09

Worked on the Purple Haze Noise with Marvin and Dean.

Friday morning we talked with Sarah Johnson (Durston), who was the grad
student from Rochester that was in charge of the Central Calorimeter
internal HV wiring.  She filled in a lot of information about this
cabling:

  - The detector end of the 8 wire Reynolds cables from the HV
    feedthrough are terminated in pins that plug into the white
    terminal blocks.  There are no Reynolds connectors in CC like
    there are in EC.

  - A given wire from a Reynolds cable goes directly to a white
    terminal block and then via pig tail wires into a Calorimeter
    Module.  There is no intermediate set of connections.

  - The pins are both crimped and soldered to the Reynolds cables from
    the feedthrough and to the pig tail wires going into the detector
    modules.

  - Both the pins and the metal cylinders in the white terminal blocks
    are Gold plated.

  - She does not remember how the pins are held into (i.e. electrically
    connected to) the metal cylinders in the white terminal blocks.

  - She does not think that there were any photos taken of the final
    assembly and cabling process of CC.

  - Sarah's description of the internal HV cabling matches what John
    Najdzion told us Thursday night.

We had a meeting Friday morning to discuss what we have learned and how
to go forward with this work.

  - Compared to many noise investigations (which often have confusing
    ambiguous results) all the tests this week have been very clear and
    self consistent.  It has been very helpful to make tests from the
    Filter Boxes.
    Everything points to an open circuit in the HV feed to the South
    end of this Calorimeter element.  This open circuit is most likely
    right at the point where the incoming HV wire connects to the
    2 pig tail wires that run inside the module to clip onto both sides
    of the boards at the South end of this module.  The open circuit is
    such that there is no connection to either of the pig tail wires
    running into the module.

  - Recommendations:

    1. We need to learn more about the components used in this cabling.
       Dean is going to talk with Bob McCarthy who was in charge of the
       CC assembly work in the clean room.  John Najdzion thinks that
       we still have spare components from this HV cabling (white
       terminal blocks and such) stored someplace at D-Zero.  We will
       ask John and Pete Simon to try and find them.

    2. Continue thinking about this for the next two weeks.

    3. If by two weeks from now there is no new information that argues
       against it then we would like to power this module from the
       North at full voltage and ground the South connection to this
       module.  This will cause current to flow across (spark across)
       the open circuit and may "weld" things back together.  This is
       essentially what we believe fixed the Purple Haze Noise problem
       a couple of years ago.  What we hope to see is that after
       sparking for a day or two the circuit welds itself and we
       start to see stable current draw from the supply feeding the
       North side.  We would probably let it spark for up to a week or
       two before giving up.  It's impossible to judge the likelihood
       of success of this experiment but there is very little chance of
       causing more damage.

Water Leak in MCH-1

The water leak behind M123-M124 in the manifold appears to have been
fixed during the power outage this past Monday as planned.  Things look
closed up and running OK.  All external fans are running and the back
of M123 is 90.7 degrees.

Some kind person re-arranged all the icons on the TFW TCC.
I assume this was done during the cold start.
It's a pc so folks do whatever they want to.

The first of the Wiener supplies was given to Mike Matulik for the
diode upgrade.  I gave him the spare one from the cabinet.  He will
then do the rest starting with ADF Crate A.  I need to write a second
note to the L1 Cal folks to remind them of the upcoming supply swaps.

I can not bring the new feedthrough card back to MSU this trip because
it is still being pressure tested.

Need to bring a red probe for the Fluke VOM - ours was lost during the
PHN work.  Need an AA for the clock.

1956 1035 1987 AMD 1203 1247 1993

------------------------------------------------------------------------------

DATE:  10:12-JUNE-2009         At:  Fermi

TOPICS:  Trigger Tower work, shutdown

Study the -3,31 HD noise problem.  Note that this TT was Excluded and
then we tried using it again and after one day it acted up a second
time and was re-Excluded.  I have never seen it look bad on the scope.

For this study, Exclude all TT's except either -3,31 HD or -4,31 HD.

With a 1 GeV threshold (20 counts) triggering on HD only:

   with -3,31 plugged into -3,31  see an And-Or rate of about 600-700 Hz.
   with -4,31 plugged into -3,31  see an And-Or rate of about 350 Hz.
   with -4,31 plugged into -4,31  see an And-Or rate of about 85 Hz.
   with -3,31 plugged into -4,31  see an And-Or rate of about 150 Hz.

With a 2 GeV threshold (24 counts) triggering on HD only:

   with -3,31 plugged into -4,31  see an And-Or rate of about 0.05 Hz.
   with -4,31 plugged into -4,31  see an And-Or rate of about 0.01 Hz.
   with -3,31 plugged into -3,31  see an And-Or rate of about 0.02 Hz.
   with -4,31 plugged into -3,31  see an And-Or rate of about 0.02 Hz.

Conclusions:

 1  None of these rates/thresholds could cause a problem during normal
    Physics triggering, i.e. the noise is not happening during
    this test.

 2  I was stupid and used an adjacent eta instead of an adjacent phi in
    the L1 Cal system.  Using an adjacent phi would have made it a
    little easier to understand the results of this study.
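One way to read the 1 GeV numbers above is to separate the effect of the
input channel from the effect of the cable.  The sketch below is only an
illustration of that bookkeeping, not an analysis that was done at the
time.  It assumes that "X plugged into Y" means the signal cable from
tower X plugged into the L1 Cal input for tower Y, and it enters the
"600-700 Hz" reading as its midpoint.  The variable names are made up
here.

```python
# And-Or rates in Hz at the 1 GeV threshold, keyed by (cable, input).
# Assumption: "X plugged into Y" = cable from tower X into input Y.
# The "600-700 Hz" entry is represented by its midpoint, 650 Hz.
rates_hz = {
    ("-3,31", "-3,31"): 650.0,
    ("-4,31", "-3,31"): 350.0,
    ("-4,31", "-4,31"): 85.0,
    ("-3,31", "-4,31"): 150.0,
}
towers = ["-3,31", "-4,31"]

# Same cable, different input: how much does the input channel matter?
input_ratio = {
    cable: rates_hz[(cable, "-3,31")] / rates_hz[(cable, "-4,31")]
    for cable in towers
}

# Same input, different cable: how much does the signal source matter?
cable_ratio = {
    inp: rates_hz[("-3,31", inp)] / rates_hz[("-4,31", inp)]
    for inp in towers
}

for cable, ratio in input_ratio.items():
    print(f"cable {cable}: input -3,31 / input -4,31 = {ratio:.1f}")
for inp, ratio in cable_ratio.items():
    print(f"input {inp}: cable -3,31 / cable -4,31 = {ratio:.1f}")
```

With these numbers both the input channel and the cable change the rate,
and the input channel changes it by the larger factor.  None of this
changes conclusion 1, since all of these rates vanish at the 2 GeV
threshold.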
Other single TT runs identified problems with TT's -6,22 EM and
-6,26 HD.  The cal element for these TT problems was identified.
See control room log book entry:
http://www-d0online.fnal.gov/crlw/Index.jsp?inquiry=L1Cal_excluded_towers_01109_em-6_22_hd-6_26

All external fans were running OK and things looked closed up OK.

They will work on the header water leak behind the TFW during the
1st power outage of this shutdown.

The TFW work is scheduled for Monday July 27, i.e. right after the
2nd power outage.

1993 1824 1987 1977 2187

------------------------------------------------------------------------------

DATE:  27:29-MAY-2009         At:  Fermi

TOPICS:  Water Leak, Trigger Tower work, Walk Through,
         22 Hz Rate Dips L2

I checked the currently excluded TT -3,31 HD and -6,26 HD a number of
times.  I never saw any noise on either of them that would cause them
to be Excluded.  Darian verified that these two TTs are still
currently Excluded.

I looked at -5,29 HD.  This is the TT that Selcuk said was bad in the
Pulser run that he took recently.  It did look bad on the scope.
I took pictures and showed them to Dean.  It looked non-terminated,
i.e. the blue cable was plugged into the patch panel OK but the PFC
cable was either not plugged into the patch panel or not plugged into
the transition module (or the transition module was not plugged in).
The Patch Panel end looked OK.  The Transition module end of the PFC
cable "clicked" as I pushed on it.  It went to a slot in the crate that
had had a replacement LVDS cable installed.  I assume that the PFC
cable was disturbed when the LVDS was replaced.  Now the scope pictures
of real energy deposits look OK.

The MCH-1 water leak is in the threaded section of an all metal header
that feeds the TFW.  It is just dripping from the threads once every
5 seconds or so.  It is now being collected and pumped out.  The intent
is to run this way until the shutdown.  We assume that it will not blow
out and become a big leak in the immediate future.
Rate Dips - "L2 Global not sending all pending L2 Decisions to
the L2TFW"

I think that it is easy to see this problem from the Control Room
displays.  You do not need to do anything special to see it.  During
the low L1 rate running before a new store just watch the "L1 Awaiting
L2" number on the L2 display.  It is normally 0 and sometimes 1.
When it becomes normally 1 and sometimes 2 then you are in the
situation that has been called "Rate Dips" or, from the TFW point of
view, "L2 Global not sending all pending L2 Decisions to the L2TFW".

I watched the L2 monitor display.  Starting at about 19:14 Wednesday
evening during P bar injection while the L1 rate was about 20 to 30 Hz
I saw the following:

   L1 Awaiting L2 was 1

   All L2 Global inputs said 1/16 except L2 MUF which said 0/16

   All L2 MUF inputs said 1/16 except "SLIC 67 ABC NE" said 0/16

   All inputs to: L2 MUC, L2 Cal, L2 PS, and L2 CTT said 0/16

To me this indicates that L2 Global had an event that it could not
process because it was awaiting data from L2 MUF.  In turn L2 MUF was
awaiting data from "SLIC 67 ABC NE".  The other 4 L2 Pre-Processors had
completed their last event and sent their data to L2 Global.

While this condition existed I told the TFW to stop issuing L1 Accepts
for about 30 seconds.  I did this to verify that with the Trig/DAQ
system in a static state this condition persisted and that the
L2 monitoring screen continued to look as described above.  They did.

I also watched the stores go in Thursday evening and Friday evening but
the system never got into this situation during either of those
shot setups.

It would be nice if L2 could log their monitor information.

Walk through and all external fans are running OK and things are
closed up.  91.9 F

1203 1268 1289 1866 1956 1046 1035

------------------------------------------------------------------------------

DATE:  13:15-MAY-2009         At:  Fermi

TOPICS:  Trigger Tower work, walk through

Checked again with the scope TTs: -3,31 HD and -6,26 HD.
On Wednesday I could see -6,26 HD making the type of noise that it did a couple of weeks ago (the noise that I posted scope pictures of on the web). I still have never seen -3,31 HD make noise. As far as I know we still need single TT runs on both of these noisy TTs when they are making noise.

All external fans are running and things are closed up OK. The M124 rear temperature is 91.9 F.

There are still reports of water under the MCH-1 floor. I checked under a tile in front of the TFW and it looked dry.

Need to bring meter batteries on the next trip.

Replaced the DAQ-480 power strip.

------------------------------------------------------------------------------
DATE: 29-APR:1-MAY-2009 At: Fermi
TOPICS: Bit-3 tests for a new Spare, Trigger Tower work, Status box - monitor update test

So far the tests of the "New Spare L1 Cal Trig Bit-3" PCI end at MSU look OK. It has passed 10** loops of random register test directly into the test ADF-2 crate. Still need to get a "New Spare L1 Cal Trig Bit-3" VME end. The inventory list of Bit-3 cards is at:
www.pa.msu.edu/hep/d0/ftp/l1/framework/logs/Bit3_Inventory.txt

Currently the story is that we will not be receiving a spare VME end of a Bit-3 because there really is not one just sitting around somewhere.

Since the flood last weekend Selcuk has had to Exclude two Trigger Towers to keep the L1 Cal Trig rates stable. He was asked to exclude them by the Run Coordinators. Note that the flood also tripped off the Cal HV racks and that it is still 100% humidity in MCH-1 with the Cal HV supplies. So either the restart of the Cal HV or the humidity could be making noise problems.

The two Excluded TTs are: -3,31 HD and -6,26 HD. I've checked both of them on the scope. Both have good differential physics signals. -3,31 HD looks clean. -6,26 HD has significant 132 nsec sync noise on it, but -7,26 HD and -8,26 HD have more 132 nsec sync noise so I do not think that that is the problem.
OK, I have finally seen the "noise" problem on -6,26 HD. It is an oscillation with a period of about 6 to 7 usec and an amplitude of about +- 5 or 10 GeV. It starts and stops very suddenly. Walter checked the BLS card during the Friday late afternoon access but there was nothing obviously wrong. I have scope pictures to show to Dean.

Talked with Geoff about the monitoring displays and whether they update when event data is not flowing. Agreed that I will make a box (like we had) so that we can artificially assert Busy signals and test this.

Racks are now closed up and all fans inside and out were checked. There is still water under the floor and the MCH-1 Liebert is still de-humidifying.

1993, 1863, 1866, 3082

------------------------------------------------------------------------------
DATE: 25:26-APR-2009 At: MSU
TOPICS: Water leak onto the TFW, Geographic Section 0x65 stuck L2_Busy

I was called at about 9 AM Saturday morning because there had been a water leak on MCH-2 that resulted in a lot of water coming down into MCH-1, and a lot of water ran down into the TFW. It was reported that water came down the SCL cables and into the back of M124. I was told that they tried to divert the water away from the TFW and to keep it running, but there was enough water in one of the racks that it eventually tripped off.

After stopping the leak and cleaning up the water they started and ran just the fans for a while and then in the early afternoon they powered up the TFW. Folks in the control room took care of cold starting the system and they got everything running OK.

Saturday afternoon while checking TrgMon I noticed that 0x65 was L2 Busy and there were notes in the control room log book about 0x65 "going" L2 Busy.

Saturday about 8 PM I was paged because folks were working on the 0x65 L2 Busy problem and did not yet have a solution. I was told that they had swapped SCL cables and that the result of that test pointed to a problem in the crate.
I was also told that the SBC was seeing the VRBC assert the Slave Ready line.

Sunday about 1 AM I was paged. 0x65 was no longer showing 100% L2 Busy. It was showing some random percentage L2 Busy. It was also discovered that with the SCL Status cable unplugged from crate 0x65 it still showed some random percentage L2 Busy.

Geoff was in the control room and, working with him, he unplugged the SCL Status cable for 0x65 from the M124 SCL Hub-End and tried to dry the connector. After plugging it back on to the Status Concentrator module the 0x65 L2 Busy was back to working correctly. Geoff removed the back door from M124 so that more air would flow through the Status Concentrator cards and the system was run for some hours with the M124 back door off.

Over the next few days there were

------------------------------------------------------------------------------
DATE: 14:15-APR-2009 At: Fermi
TOPICS: Tutorial talk, cold electronics meeting, MCH-1 check

I gave the TFW tutorial talk on Tuesday.

Cold CMOS electronics meeting on the 14th floor all Wednesday morning.

Things look closed up in MCH-1; all visible fans and temperatures are normal.

ADF-2 SN# C1 going back to MSU.

Fully running DAQ-480 in NuMI with preamps on and file transfer running OK.

------------------------------------------------------------------------------
DATE: 1:3-APR-2009 At: Fermi
TOPICS: Main Injector Protons, L1 Cal Future Support Meeting, Bit-3 Inventory, Walk Through MCH-1

MI typically does 8.0E12 to Pbar and 31.6E12 to NuMI.
NuMI total of 7e20 POT was surpassed on Tuesday.

DAQ-480 running in NuMI with both crates doing DMA. It is nice and fast. Everything is put together and ready to run.

The following is a list of all of the Bit-3 stuff that I can find in our cupboards at D-Zero. I dug through everything. There is a PCI end of a 618 Bit-3 in the Spare TCC.

  PCI end of a 618 "Broken L1 Cal"
    S/N 698977 85851325 Rev E
    April 2004 on its fiber optic transceiver

  PCI end of a 618 inside the Spare TCC stored at D-Zero
    S/N 623605 85851325 Rev D
    July 2003 on its fiber optic transceiver

  PCI end of a 617 copper Bit-3
    Label information: W.O. 1454 4798 P/N 85221511 A
    186293 (on a bar code label)

  VME end of a 618 "Spare MSU Fiber Optic Bit-3 at D0"
    It has a 3" x 4" mezzanine circuit board by its J2 connector.
    It has a wire wrap Grant Jumper setup for running ADF-2.
    There is no date code on its fiber optic transceiver.
    Date codes on ICs: 9848, 9902, 9903
    Label information on its J2 connector: 85853432 Rev C
    197412 (on a bar code label)

  VME end of a 617 copper Bit-3 "L1 Trigger Spare"
    Date codes on ICs: 9824, 9836, 9841
    Label information on its J2 connector: 85154554 Rev A
    187882 (on a bar code label)

There is also a spare Bit-3 copper cable and a spare Bit-3 fiber optic cable in our locked cabinet.

For reference the clearly labeled "Columbia" 9U Bit-3 VME end in the Sidewalk test stand is: S/N 629279 85853656 Rev A August 2003 on its fiber optic transceiver.

Sometime when it is OK to do so (e.g. this summer's shutdown) I will pull the Bit-3 VME card out of the Comm/Control Crate and verify its information.

Received from Selcuk the PCI end for the "NEW L1 Cal Trig Spare Bit-3". So far we have only the PCI end for this New Spare. It is:

  PCI end of a 618 "New Spare L1 Cal Trig"
    S/N 584675 85851095 Rev B
    October 2002 on its fiber optic transceiver

I will bring back to MSU both ends of the "Broken L1 Cal Trig" Bit-3 and the PCI end of the "New Spare L1 Cal Trig" Bit-3.

Two walk throughs of MCH-1 and things look OK. All outside visible fans are running. 92.8 degrees at the back top of M124.

------------------------------------------------------------------------------
DATE: 18:20-MAR-2009 At: Fermi
TOPICS: Walk through of MCH-1 things look OK. Check the M124 inside fans.

From the Friday Ops meeting: 2010 running is definitely on, and they want to run in 2011 and are asking for money to do that. The summer shutdown start is now fixed at June 15th.

34-D and 37-D-Maestro are the two spare ADF-2 cards that have been at D-Zero for the past couple of months. On this trip I returned 5-C to swap for 34-D. So now 5-C and 37-D-Maestro are at D-Zero.

The ntp servers at Fermilab are: 131.225.8.200 and 131.225.17.200.

The current frequencies shown on the Tektronix frequency counter in M100 are: at 150 GeV 53.10368 MHz and at 980 GeV 53.10468 MHz. See the 13-Mar-2009 log book entry.

DAQ-480 is now running in NuMI with the dual crate setup and the new wire order and site dependent file. With one DMA and one PIO it is running at a rate of one event per 2.2 sec.

------------------------------------------------------------------------------
DATE: 13-Mar-09 At: MSU action at Fermi
TOPICS: Switched Master Clock PCC module to Free-Run

Power was lost in the D0 Service Building at about 8:20 AM and thus the Store was lost. Because of no power in the D0 Service Bldg we had no RF or Sync signals from the Tevatron. This caused the D-Zero Master Clock to run down to a frequency of about 53.09823 MHz. George called and asked about moving the Master Clock's PCC module from "Normal" mode to "Free-Run" mode. Doing that put the clock at 53.10456 MHz.

------------------------------------------------------------------------------
DATE: 3:5-Mar-09 At: Fermi
TOPICS: D-Zero, NuMI Tunnel, and CalOp meeting

Walk through of our equipment in MCH-1 looks OK. Need a battery for the thermometer on M124 top crate. All visible fans look OK.

Installed the dual VME crate DAQ system in NuMI tunnel. Top PreAmp card file was stacked.

------------------------------------------------------------------------------
DATE: 17:20-Feb-2009 At: Fermi
TOPICS: Get L1 Cal Trig Running, Wiener Power Supplies

L1 Cal Trig Trouble - What happened

1.
Sometime, probably late last week, the rear fan in M101 failed. During Run II there have been previous problems with the two fans in M101. Previous M101 Fan problems on: DATE: 4:6-OCT-2006 and DATE: 26:28-JUNE-2007 in our log book.

Brief Version:

  DATE: 4:6-OCT-2006
    The "blower running" light was OFF, Fuse opened, motors both turned freely, fuse replaced, motor currents checked and both read fine, checked after some hours of running and the motors were not hot

  DATE: 26:28-JUNE-2007
    The "blower running" light was OFF, Fuse opened, motors both turned freely, fuse replaced, change to running only the rear fan, motor current was fine, motor was not hot after running hours

  16-FEB-2009
    The "blower running" light was ON, the fuse was OK, the rear fan was turning very slowly, its motor was quite hot, Tuesday morning I quickly switched from running just the rear fan to running just the front fan, checked after hours of running and the front motor is not hot, the rear fan is hard to turn by hand, it clearly has a bad bearing or two

  --> Bring a second spare fan to D-Zero.

M101 fan failure has a known signature of crate 0x10 SBC problems. Recall that nothing was changed in the 0x10 readout crate between the old Run IIA and the new Run IIB L1 Cal Trigs - so we have a long history of this crate's behavior.

We did not do a very good job communicating about the 0x10 SBC problems. When I was contacted Sunday evening about power cycling 0x10 I should have pushed on folks to learn why they wanted to do this. We should have been contacted before folks decided to change the SBC in "our" crate. Monday, during the day, none of us talked about the 0x10 SBC problems because we were looking at the bigger more immediate and obvious problems in ADF and TAB/GAB crates.

2. Starting sometime Sunday morning, something failed in the communications path that lets L1Cal TCC talk to the cards in the L1 Cal Trig.
The first clear indication of this problem was a Post-Write Check Error in an ADF crate while TCC was doing an active pedestal correction at 10:07 AM. As designed, TCC by itself managed to work around these first few known communication errors by rewriting register information when necessary.

Sometime early Monday morning this communications path problem resulted in TCC overwriting the "module IDs" stored in the TAB and GAB cards. This made the readout data from L1 Cal Trig un-unpackable and caused other problems.

A couple of times Monday afternoon this communication problem resulted in TCC overwriting control registers in two different ADF crates, which caused the high speed ADCs on most of the cards in these crates to shut down. Thus the current drawn from the 3.3V bus fell to 15% of its normal value and that caused an SES alarm.

Sometime Monday before 19:39:13 this communication problem resulted in TCC overwriting the on-board configuration memory in the VRBC card in the readout crate. From 19:39:13 on Monday evening there was no possibility that L1 Cal Trig readout would work until its VRBC's on-board configuration memory was reloaded.

Note, there were two separate problems. There is no possibility that the fan problem in M101 could cause the communication problem.

The communication problem could have been caused by:

  - a problem in the TCC computer itself
  - a Bit-3 problem at either the PCI end inside the TCC or at the VME end in the rack M108 Communication Crate
  - a problem with the VME bus in the Communication Crate e.g. a card in Communications Crate with a hung address or control line

For a try at a fix we replaced both ends of the Bit-3. It will take time to "prove" that this actually fixed the communication problem.

The net action to get the system running was:

  - Switch M101 from its broken rear fan to its good but previously disconnected front fan.
  - Reload the VRBC's on-board configuration memory.
  - Replace both ends of the Bit-3 link (no other cards were changed).

Examples of errors that happened when TCC tried to write to control registers in the ADF-2 cards.

  15-Feb-2009 10:07:11 Post-Write Check Error
    Master#0/Slave#3/Slot# 2/Chip# 0/Reg#512 Read = 0x0008 / Last Written = 0x0009
    Second Register Write Successful used by Track_Ped

  15-Feb-2009 17:21:00 Post-Write Check Error
    Master#0/Slave#3/Slot#18/Chip# 1/Reg#512 Read = 0x0000 / Last Written = 0x0009
    Second Register Write Successful used by Track_Ped

  16-Feb-2009 11:58:13 Pre-Write Check Error
    Master#0/Slave#3/Slot#14/Chip# 0/Reg#512 Read = 0x0008 / Last Written = 0x0009

  16-Feb-2009 15:18:36 Pre-Write Check Error
    Master#0/Slave#0/Slot#11/Chip# 0/Reg# 1 Read = 0x0040 / Last Written = 0x0050

The VRBC quit reading back 0x10 for the crate ID with the initialization at 16-Feb-2009 21:17.

  > Card ID Read from VRBC is 0x0010 Serial Number 0x17 %%16-Feb-2009 16:50:51
  > Card ID Read from VRBC is 0x0010 Serial Number 0x17 %%16-Feb-2009 19:39:13
  > Card ID Read from VRBC is 0x0017 Serial Number 0x17 %%16-Feb-2009 21:18:16
  > Card ID Read from VRBC is 0x0017 Serial Number 0x17 %%16-Feb-2009 21:28:47

Talked with Mike Matulik about the Wiener power supply modification. It sounds good for this summer shutdown. We also talked about "where is all the configuration information stored". Is there a little card with a serial PROM in the bin? He is going to talk with the Wiener folks and learn the details.

------------------------------------------------------------------------------
DATE: 16-Feb-2009 At: MSU
TOPICS: Phone calls from the Control Room about L1 Cal Trig

During the day we were called because plots e.g. MEt looked wrong and the readout data was screwed up.

Called again because the 3.3V draw in 2 crates had gone from 35 Amps down to 6 Amps. In one case they Initialized the L1 Cal Trig and got running again. In the other case they power cycled everything and got running again.

Paged at 10:30 PM and again a little before midnight because they had changed the 0x10 SBC and then could not make anything run.

Many calls during most of Tuesday early AM trying to make the readout crate operate - trying such things as disabling all the VRB channels.

------------------------------------------------------------------------------
DATE: 15-Feb-2009 At: MSU
TOPICS: Crate 0x10 (L1 Cal Trig Readout) Power Cycle

Paged at about 7 PM by the DAQ Shifter because the Control Room wanted to Power Cycle the L1 Cal Trig Readout Crate. Specifically they wanted to know how to start the crate again after the power cycle. I directed them to the written L1 Cal Trig Power Up procedure in the notebook in the Control Room. I asked, what's wrong, why do you want to power cycle this crate? Answer: Bill told me to. I specifically asked, is the blower running on M101?

------------------------------------------------------------------------------
DATE: 4:6-Feb-2009 At: Fermi
TOPICS: More work on noisy Trigger Towers, Update the single TT run procedure, work in NuMI Tunnel, boards to 14th floor, meetings, Return the spare Bit-3 to cabinet

Walk through MCH1. Things look OK and racks are closed up with all visible fans running. The temperature at the top of M124 is 93 degrees which is up a little. Next trip, replace that battery, and check the internal fans in the back of M124.

More work on noisy TTs.

  -9,22 HD   the noise is from Tower 1 Depth 10
             Dean installed a special summer with Tower 1 Depth 10 resistor cut

  +6,25 EM   the noise is from Tower 2 Depth 2

  +11,24 HD  the noise is from Tower 1 Depth 7 and a little from additional cells
             Dean installed a special summer with Tower 1 Depth 7 resistor cut

  +10,24 HD  the noise is from Tower 2 Depth 7 and a little from additional cells.
             Dean installed a special summer with Tower 2 Depth 7 resistor cut

  Note that the +11,24 HD and the +10,24 HD noise is synchronous, i.e.
  both TTs are picking up the same "sparking"

  +6,7 EM    the noise is from Tower 3 Depth 7

PFC-16 to Paula.
  2 B, 3 B, 5 B, 6 B, 7 A, 9 A, 11 D, 12 D, 14 D, 16 D, 18 A, 19 A, 21 B, 22 B, 24 B, 26 D, 29 C, 31 A, 32 A and the Narrow Gaussian model card 25 D

Friday meeting with Nina, Marcus, and Walter about feedthrough pcb design.

Return the Spare L1 Cal Trig Bit-3 (PCI & VME) to our spares cabinet at D-Zero.

Update the procedure for running the Single Trigger Tower to use the new official location of tablib.

Running the Single Trigger Tower "Noise Trigger"
------------------------------------------------

Exclude all TTs except for the one that you want to trigger on:
---------------------------------------------------------------

1) On d0tcc3 you will find Test_Exclude_Most_Trigger_Towers.msg in /tcc/L1Cal_IIb_Work/Config/.

2) You can edit this file to your needs and aim at the tower(s) of your liking.

3) From the GUI main menu select "TCC Com File" and enter the name of the file in the text box:
     /tcc/L1Cal_IIb_Work/Config/Test_Exclude_Most_Trigger_Towers.msg
   or click your way to it using "Locate TCC ComFile"

4) click on "Self Msg ComFile" to tell the TCS engine to ingest and execute this file.

5) If you need to switch to a different tower(s), repeat steps 1-4

6) When you are all done, *do not forget* to re-initialize the L1Cal system

Setup the TAB & GAB for "Noise Trigger":
----------------------------------------

1. On d0tcc3 cd ~d0cal/l1cal2b/tablib/run

2. Run the program: ./bin/setup_noise --help
   (This will list the usage of this program)

3. Run the program: ./bin/setup_noise --enable
     NUM_TOWER: Number of towers in this trigger
     ADC_COUNT: 16+4*Threshold (per tower in GeV)
                (16 = 8+8 pedestals for EM+HAD and 4 counts/GeV)
     --enable means to actually do it

4. The output of this trigger is on And-Or Term 192.

5. To go back to regular running, initialize the system with the L1Cal_IIb_GUI.
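
For reference, the ADC_COUNT arithmetic from step 3 above can be sketched in Python. This only illustrates the 16+4*Threshold rule quoted in the procedure. The function name adc_count and its parameter names are made up for this note and are not part of tablib or setup_noise.

```python
def adc_count(threshold_gev, pedestal_counts=16, counts_per_gev=4):
    """Return the per-tower ADC_COUNT for a threshold given in GeV.

    Assumes the rule quoted in the procedure: 16 + 4 * Threshold,
    where 16 = 8 + 8 pedestal counts (EM + HAD) and the scale is
    4 counts per GeV. The names here are hypothetical.
    """
    return round(pedestal_counts + counts_per_gev * threshold_gev)


if __name__ == "__main__":
    # Print the ADC_COUNT for a few example thresholds in GeV.
    for gev in (1, 5, 10):
        print(gev, "GeV ->", adc_count(gev), "counts")
```

With these assumptions a 5 GeV per-tower threshold corresponds to 16 + 4*5 = 36 counts.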

TAKER Trigger Configuration File:
---------------------------------

Use the Trigger Configuration File:
/commissioning/l1cal/l1cal2b/trial/l1cal2b_noise_wcal_wsmt_1.0.xml

------------------------------------------------------------------------------
DATE: 21:23-Jan-2009 At: Fermi
TOPICS: More work on the "warm" TT region, check MCH-1, DAQ 96 in NuMI Tunnel, CalOps meeting

Walk through MCH1. Things look OK and racks are closed up with all visible fans running. It sure looks like a lot of water came down 3 or 4 weeks ago when they had the leak on the 2nd floor.

CalOps meeting: discussed the +6,25 EM blow out last Sunday, not checked on scope on Monday, no single TT run on Monday during a day of ZB, who is now keeping records of things like what resistors are cut. No clear answers.

More work on looking at the TT noise in the warm region. -10,26 HD and -10,27 HD are the worst.

  - The bursts of noise in both of these channels go away when you stop the L1 Accepts. If you trigger the scope on the L1 Accepts then you do not see these bursts stable on the scope. During the bursts the amplitude is about 2x what it normally is and the period is about 180 nsec.

  - If you trigger the scope on the Begin of Turn marker then the 132 nsec sync noise in -10,26 HD looks very very unstable. The 132 nsec sync noise in -10,27 HD is also unstable but not as unstable as -10,26 HD. I used -9,12 HD as a reference of "stable" 132 nsec sync noise.

Phone numbers down in the NuMI tunnel:

  Detector Hall, Experiment 875:     5875
  Detector Platform, Center:         4578
  Detector Platform, North:          5578
  Detector Hall (Controls and PS):   6639

Your best bets are: 5875 and 6639. Try a couple of times before giving up.

DAQ 96 480 is running in NuMI tunnel. Pedestal noise runs with the Transrex both ON and OFF. Yes you can see it - duh.

------------------------------------------------------------------------------
DATE: 7:9-Jan-09 At: Fermi
TOPICS: