                     D-Zero Hall Log Book for 2007
                     -------------------------------

The most recent entries are near the beginning of this file.
This file begins in January 2007.

This file contains both Trigger Framework and L1 Calorimeter
Trigger entries.

Earlier D-Zero Hall Log Books are on the web in one of the
following directories:

    http://www.pa.msu.edu/hep/d0/ftp/l1/framework/logs/
    http://www.pa.msu.edu/hep/d0/ftp/run1/l1/inventory_logs/

------------------------------------------------------------------------------

DATE:  31-DEC-2007          At:  MSU

TOPICS:  Recent 0x20 out of sync errors

During the recent ZB running while the Tevatron is being repaired
there were a lot of L2 Global L2_Decision mismatch errors.
These are 3 clips from the Shift Summaries and Control Room Log Book.


DAQ Shift Summary

Date Created:  Thursday, December 27, 2007 3:49:42 PM CST
Date Saved:    Thursday, December 27, 2007 3:53:44 PM CST
Category - Topic - sequence number:  DAQ/Log - DAQ_Log - 673062
Operator(s):   Hang Yin, George Golovanov
Keyword(s):    :DAQ:END_SHIFT:
Shift:         DAY
Time:          3:50:05 PM CST Dec 27, 2007
Operator:      Georgy and Hang
Summary:       L2GBL problems at the beginning of the shift.
               After resetting the trigger Framework it's gone.
               We were taking cosmic data:
               Run #238859 total 1190 events,
               Run #238860 total 50073 events.
               We are running zero_bias without SMT.


Control Room Log Book Entries

Sunday, December 30, 2007 18:14:34 CST: CAPTAIN/Log: 673426: Bill Lee

  I arrived to the control room and found that x20 was going out of
  sync on a regular basis. After about 4 minutes of x20 out of syncs,
  I stopped the run, issued a initl1fw, and restarted the run. This
  did NOT help the situation. After several more minutes of x20 out
  of syncs, the DAQ shifter did another initl1fw. There were then
  about 5 more x20 out of syncs and then the problem went away.

Sunday, December 30, 2007 18:35:20 CST: CAPTAIN/Log: 673440: Bill Lee

  We have another period of x20 out of syncs.
They were occuring at about one or two per minute, so not as frequent as before. I have tried the initl1fw procedure again. We will see what happens. So far after two minutes, we have not had a problem. **Comment by billl on Sunday, December 30, 2007 6:46:03 PM CST so about 13 minutes until the next x20 lost sync ------------------------------------------------------------------------------ DATE: 14-DEC-2007 At: MSU TOPICS: Switch to Trics V11.2.C Switch to new Trics version 11.2.C which will provide the same protection to M122-Middle as for M122-Top. M122-Top holds the Level 0 Per bunch scalers (to measure the delivered luminosity) and is read about twice every 5 seconds, once for general monitoring and once for luminosity snapshot. M122-Middle holds the exposure group per bunch scalers (part of the measurement of the luminosity recorded by each specific trigger), and is read once every 5 seconds for general monitoring and once per minute (and for special transitions) for luminosity snapshot. The Autostart version has also been updated. Updated /hep/d0/ftp/l1/framework/logs/m122top_readout_problem_list.txt ------------------------------------------------------------------------------ DATE: 8-DEC-2007 At: in the car TOPICS: Hung M122 Middle Crate VME Cycle I was paged Saturday evening because at TFW Initialize time right before a new Store TCC discovered that the M122 Middle crate, i.e. the Exposure Group per Bunch Scaler crate was hung. The control room was able to push the M122 Middle crate VME reset button and then Initialize the TFW again and things ran OK. Currently TFW TCC is only providing protection for M122 Top crate but Philippe will now start work to add protection of M122 Middle crate. Note below that they say that the PBS card with its VME LED stuck on was SM-41. Our inventory file shows that SM-41 is in slot #11 of the M122 Middle crate. 
When M122 Top crate hangs it is also probably most often the card in slot #11 is the one that hangs but cards in other slots have also hung in M122 Top. From the Control Room Log Book: Saturday, December 8, 2007 19:21:19 CST: CAPTAIN/Log: 669082: Michael Kirby Problems with Initializing the Trigger Framework. Contacting Dan Edmunds, or at least trying to. **Comment by Kirby on Saturday, December 8, 2007 7:33:49 PM CST Problem solved with the help of Dan. Just needed to reset the SM41 card in the L1TFW. Saturday, December 8, 2007 19:49:47 CST: DAQ/Log: 669088: Shannon Zelitch During shot setup, in trying to reinitialize the level1 trigger framework we recieved the following error: d0ol112|/home/d0run>initl1fw Reading parameters file: /mnt/online/data/coor/coor.params. Reconnecting. WAIT *bad*Fail: [init -> l1dnl] *bad*Fail: [L2_Global_Obeyed -> l1dnl] *bad*Download completed ... but with ERRORS. Paged Dan Edmunds (who was in a car en route to MSU). Problems.txt had a description of this problem. : I was told to go to the top crate in rack 122 in the MCH and check if the top LED of one of the cards in the crate was stuck on. I followed the expert's instructions and pressed a recessed button on the card in slot 1. This cleared the error and I was able to issue the initl1fw command successfully. For us, the problem was that the top LED on one of the cards (SM-41) in the second crate down in Rack 122 in MCH was on. Dan Edmunds said that we could try the same solution, pushing the topmost recessed button on card in slot 1 in the effected crate. In doing so, the LED on card SM-41 went out, and we were able to reinitialize the L1TW without errors. Currently running global_CMT-15.92 with scraping prescale. ------------------------------------------------------------------------------ DATE: 8-DEC-2007 At: Fermi TOPICS: Look at TT Signals, note about how to check the L1 Fired Mask in L2 Global, work on making a FOM++ card. Check some TT signals: +8,8 EM now looks fine. 
Both EM and HD are fine for +8,8 See log book entry from 18:20-OCT-2007 While looking around +8,8 I looked at +8,6 HD. For +8,6 the HD- signal looks to be about 1/2 the size of the HD+ signal. Duh this is also known - again see 18:20-OCT-2007 +11,31 EM The EM- signal is flat. With the Ohm meter I do see the output cap on the BLS hybrid. Wrote a note to Jim about the checks that L2 Global can make to verify that it is getting a good copy of the fiber optic data with the L1 Fired Mask. Work more on AONM-FOM-05B to convert it to an FOM++. See: www.pa.msu.edu/hep/d0/ftp/l1/framework/hardware/aonm/ l1_qualifier_signal_path.txt www.pa.msu.edu/hep/d0/ftp/l1/framework/hardware/aonm/ special_functions_fom.txt www.pa.msu.edu/hep/d0/ftp/l1/framework/hardware/the_card/ the_card_customization.txt Most of this work is covered in the 6:7-Nov-2003 log book entry. On the FOM++ which is SN# 05B I installed 120 Ohm Terminator Resistors at R146, R147, R148, R149, R150, R151. Pin #1 is at the top up side of the card. These are CTS 750-63-R120. No resistors were installed at R152, R153. At R154, R155 I installed a 56 Ohm Pull Down Resistor with its pin #1 at the top up. These are CTS 750-61-R56 Ohm. On the FOM++ I connected the hardware input L1 Qualifier to only FOM++ Main Array FPGA's #1 - #5 and #2 - #6. There are no jumper wires to FPGA's #3-#7 or #4-#8. I should have brought a small bottle of flux and a fine needle for cleaning out the via33's to install these wires. ------------------------------------------------------------------------------ DATE: 29,30-NOV-2007 At: Fermi TOPICS: Connect the Logic Analyzer to L2, Work on TTs, SCL Platform Cable Length, FOM++ I connected the Logic Analyzer so that it can watch some L2_Acpt, and L2_Rej signals coming out of the L2 TFW on their way to the SCL Hub-End. This was done by tapping into cables in the back of M122 and putting a "buffer" card right next to the tap so that there are no stubs on these lines. 
The buffer card is the 100314 output driver section of a WPA card
from the Cal Trig.  Putting on the tap connectors is very hard
because there is very little slack in these cables.

A problem for this week is that I tapped onto the cables for
G.S. 31:16.  This was a mistake because I was going after G.S. 0x20.
So on the next trip I need to tap into the cables for G.S. 47:32.

For now I have tied up the L2_Acpt and L2_Rej lines for G.S. 0x10
to the logic analyzer.  They are plugged into the "D0" byte of the
logic analyzer with L2_Rej in the bit of value 16 and the L2_Acpt
in the bit of value 32.  Recall that the Mismatch Flag from L2 Global
is in logic analyzer byte "D0" in the bit of value 1.

I still need some ZB running where I can turn off all of either the
L2_Acpt or all of the L2_Rej to verify these connections.

Also recall how the information is encoded at this point in the
system:

   To send out a L2_Acpt to this G.S. both of these lines are asserted.
   To send out a L2_Rej  to this G.S. only the L2_Rej line is asserted.

So at this point in the system the 3 values that you see on these
2 lines are:

      3    This is an L2_Acpt for this G.S.
      1    This is an L2_Rej for this G.S.
      0    This L2 Decision is not for this G.S.

Recall that the rest of the decoding to give the conventional meaning
to the L2_Acpt and L2_Rej signals on a given SCL link takes place in
the SLF card in the SCL Hub-End.

So the current setup of connections to the logic analyzer is:

  Logic Analyzer
      Input          Drew DeMux     I.E. L2_Answers for
  Pod    Signals     Card Output    L1 Specific Triggers
  ---    -------     -----------    --------------------
  A3      0:7          24:31              24:31
  A2      0:7          16:23              16:23
  A1      0:7           8:15               8:15
  A0      0:7           0:7                0:7
  C3      0:7          72:79              72:79
  C2      0:7          64:71              64:71
  C     Clock_3      L2_Answer_Strobe
  D0      0          Mismatch Flag signal from L2 Global
  D0      4          L2_Rej  for G.S. 0x10
  D0      5          L2_Acpt for G.S. 0x10

I still need to confirm these L2 Acpt Rej connections.
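
As a reading aid for the "D0" byte described above, here is a minimal
Python sketch of the 3 / 1 / 0 encoding.  It assumes the bit
assignments exactly as written in this entry (bit 0 = Mismatch Flag,
bit 4 = L2_Rej, bit 5 = L2_Acpt), which are still unconfirmed.  The
function name decode_d0 and the label strings are made up for this
illustration and are not part of any D-Zero tool.  The case of
L2_Acpt asserted without L2_Rej is not one of the three values listed
above, so the sketch just flags it.

```python
# Hypothetical helper for reading the logic analyzer "D0" byte.
# Bit assignments are taken from the log entry and are NOT yet confirmed.
MISMATCH_BIT = 0   # bit of value 1:  Mismatch Flag from L2 Global
REJ_BIT = 4        # bit of value 16: L2_Rej  line for G.S. 0x10
ACPT_BIT = 5       # bit of value 32: L2_Acpt line for G.S. 0x10


def decode_d0(d0):
    """Return (decision, mismatch_flag) for one sample of the D0 byte."""
    rej = (d0 >> REJ_BIT) & 1
    acpt = (d0 >> ACPT_BIT) & 1
    mismatch = bool((d0 >> MISMATCH_BIT) & 1)

    pair = (acpt << 1) | rej          # the "3 / 1 / 0" value from the log
    if pair == 3:
        decision = "L2_Acpt"          # both lines asserted
    elif pair == 1:
        decision = "L2_Rej"           # only the L2_Rej line asserted
    elif pair == 0:
        decision = "not for this G.S."
    else:
        decision = "unexpected"       # Acpt without Rej is not a listed state
    return decision, mismatch


if __name__ == "__main__":
    for sample in (0x30, 0x10, 0x00, 0x01, 0x20):
        print(f"D0 = ${sample:02X} -> {decode_d0(sample)}")
```

An "unexpected" result on real captures would most likely point at a
wiring mistake in the tap, which is what the planned ZB test is meant
to rule out.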
Check on some TT's that Selcuk noted as problems in a recent
pulser run.

  +9,4 HD    Both HD+ and HD- look dead on the scope.
             The BLS cable does not look shorted.
             Using the Ohm meter I can not "see" the HD Driver Hybrid
             on the BLS card.
             Next step, check the BLS card for missing HD Driver or
             bent pins on the HD Driver.
             The EM side of +9,4 HD looked fine on the scope.

  +10,4 HD   Both HD+ and HD- look dead on the scope.
             The BLS cable does not look shorted.
             Using the Ohm meter I can not "see" the HD Driver Hybrid
             on the BLS card.
             Next step, check the BLS card for missing HD Driver or
             bent pins on the HD Driver.

  +10,4 EM   The EM+ signal looks dead on the scope.
             The BLS cable does not look shorted.
             Using the Ohm meter I can not "see" the EM+ side of the
             Driver Hybrid on the BLS card.
             Next step, check the BLS card for a bent pin on the
             EM Driver.

             Note that +9,4 and +10,4 are adjacent BLS cards in rack
             PS-05 Top crate.  They probably have a common BLS cable
             connector at the BLS end and the whole problem may just
             be that this connector is not well plugged in.

  -12,18 EM  The EM- signal looks dead on the scope.
             The BLS cable does not look shorted.
             Using the Ohm meter I can "see" both the EM+ and the EM-
             sides of the Driver Hybrid on the BLS card.
             Next step, try replacing the EM Driver on the BLS card.
             The HD side of this TT looks OK on the scope.

             While working on -12,18 EM I scanned this block of TT
             signals and found another bad one, i.e. -10,18 EM

  -10,18 EM  The EM- signal looks sometimes dead on the scope.
             Moving the BLS cable connector around in the patch panel
             connector makes this EM- signal come and go.
             The problem was a bad solder connection in the BLS cable
             connector.  I fixed it and this channel is now OK.

While scanning through nearby TT signals I noticed that all the TTs
(both EM and HD) in the blocks coming from BLS crates PS-11 and PS-13
middle crate had a lot of 60 Hz buzz on them.  This is about 50 or
100 mV of buzz.

   PS-11 middle crate is:   phi 31   eta +11, +12, ..., +19, +20
                            phi 30   eta  +7,  +8, ..., +13, +14

   PS-13 middle crate is:   phi 27   eta +11, +12, ..., +19, +20
                            phi 26   eta  +7,  +8, ..., +13, +14

Because of this I could not really check in detail TT +11,31 EM
which was the last one on Selcuk's list.

I did pass information about all of these TTs to Dean and company.

Saturday Dec 1st news about these TTs.  Dean did find as expected
that it was a loose connector that was the problem with +9,4 and +10,4.


Jadzia has noticed that there appears to be a 16 nsec difference
between the two CFT Sequencer crates on the platform.  A possible
cause of this is a difference in SCL cable length.  Petros and I
measured these cables some years ago but I can not find this data
in our log book.  From the log book it appears that these SCL cables
were installed in two batches:  4 in 25-July-2000 and 8 in 2-Feb-2001.

I talked with and sent email to Mike Matulik and Jadzia.  They are
set up to make this measurement sometime when there is an access.
The CFT Sequencers are 0x03 and 0x05.  I gave the adaptor cable
(TDR to SCL Cable) to Mike.

Saturday Dec 1st news about these SCL Cable lengths.  Mike and Jadzia
measured the cables for these G.S. and found:

   Length of cable SCL003 is 154.00 feet with TDR set to Vp=0.83c
   Length of cable SCL005 is 166.88 feet with the same Vp setting.


I began the work on AONM-FOM-05B to convert it to an FOM++.
I did not finish this so I need to bring back the tools again
on the next trip.

------------------------------------------------------------------------------

DATE:  25-OCT-2007          At:  MSU

TOPICS:  Track_Ped started

25-Oct-2007 16:11  Track_Ped on L1Cal TCC started to automatically
adjust the Trigger Tower Pedestals every 2 minutes.
This included the first store (#5657) after the shutdown.

Reminder: We tested it before (and during?)
the shutdown: starting 5-Oct-2007

Summary of operation:

L1Cal TCC now automatically tracks and corrects the Zero Energy
Response for the Transverse Energy of each Trigger Tower, at the
output from the ADF cards, i.e. the input to the TAB Cards, to remain
at the design value of 8 counts = 0 GeV.

- For each Trigger Tower, the Zero Energy Response Drift is measured
  from the average of 1000 samples (note: now 1008 to cover an
  integral number of turns) of Live Crossing Energy.

- If the measured drift is below a programmed threshold (0.5 counts)
  TCC immediately applies 50% of the correction needed to correct
  the measured drift.

- If the measured drift is above the threshold, TCC does not make
  any correction and waits until the next sample to decide if this
  was a temporary spike or a true step change.
  At the next snapshot, and if this was a spike, do nothing and
  return to the 50% correction mode for future correction cycles.
  If this is a step, do the full correction instantly and return
  to the 50% method for future corrections.

- These decisions are taken on a tower by tower basis.

------------------------------------------------------------------------------

DATE:  18:20-OCT-2007          At:  Fermi

TOPICS:  Return the MSU PROM Programmer to Fermi,
         Need to modify a FOM++, Check TFW Power Supplies,
         Check 4 TT signals, Meetings,
         L2 miss match Logic Analyzer results

Petr Neustroev is back at D-Zero and on the last trip he asked to
use the MSU PROM programmer again.  I brought it and its half alive
computer back down here.  They are set up in the normal place.
The only secret to running them is that you have to turn on both
the power strip next to the telephone and the power strip behind
the PROM programmer's computer and monitor.

I forgot again to bring the good soldering stuff here to modify
a second FOM++ card so that it can handle making the Hardware
L1_Qualifiers.  I need to get this done.

Check the TFW Power Supply Voltages again to see if they have
drifted since the 13,14-Sept-2007 log book entry.

                TOM Front Panel Test Points
             ---------------------------------
     Power     M122      M122      M122
     Supply    Top       Mid       Bot
     ------   -------   -------   -------
     +5.0V     5.041     5.023     5.050
     +3.3V     3.331     3.331     3.333
     -2.0V     2.004     2.025     2.018
     -4.5V     4.504     4.507     4.519

                TOM Front Panel Test Points
             ---------------------------------
     Power     M123      M123      M123
     Supply    Top       Mid       Bot
     ------   -------   -------   -------
     +5.0V     5.021     5.032     5.026
     +3.3V     3.334     3.335     3.331
     -2.0V     2.004     2.005     2.006
     -4.5V     4.497     4.503     4.504

               Power Supply Terminals
              ------------------------
     Power     M124      M124
     Supply    Top       Mid
     ------   -------   -------
     +5.0V     5.065     5.047
     +3.3V     3.348     3.331
     -2.0V     2.045     2.045
     -4.5V     5.220     5.226


Look at TT signals

Sabine reports from her new calibration data that 3 TT signals are
responding at only 1/2 of the expected level in the L1 Cal Trig.
These are:

     +8,6 HD      -11,12 HD      -12,32 HD

  +8,6 HD    Looking with the scope at the Patch Panel monitor point
             it looks like the HD+ signal is not terminated by the
             ADF-2 card, i.e. the HD+ signal looks twice as big as it
             should be.  Thus we expect an open circuit in the HD+
             line on either the Patch Panel card, the ATC card, or in
             the PFC cable.  The Patch Panel card tested out OK.
             Looking with the Ohm meter at the Patch Panel end of the
             PFC cable I can not see the 1 Meg Ohm DC discharge
             resistor for the HD+ line in the ADF-2 card.  As it is
             Friday night and I will leave in the morning I'm not
             going to shutdown the ADF Crate and pull out the ATC card
             to test it.  This repair will have to wait until the
             beginning of my next trip to Fermi (and for when there
             is no Physics beam).

  -11,12 HD  Looking with the scope at the Patch Panel monitor point
             it looks like the HD- signal is missing.  Looking
             directly at the BLS cable with the Ohm meter I see no
             indication of a shorted BLS cable and I can see the
             output capacitor on the BLS Driver hybrid (i.e. the BLS
             cable is not open).
Looking with the scope, with the Precision Pulser running, you can see a very small signal on the HD- line, about 5% of the expected amplitude and of the wrong sign. -12,32 HD Looking with the scope at the Patch Panel monitor point it looks like the HD- signal is missing. Looking directly at the BLS cable with the Ohm meter I see no indication of a shorted BLS cable and I can see the output capacitor on the BLS Driver hybrid (i.e. the BLS cable is not open). Looking with the scope, with the Precision Pulser running, you can see a very small signal on the HD- line, about 5% of the expected amplitude and of the correct sign. While scanning all of the TT signals in the same Patch Panel as the reported bad +8,6 HD TT, we discovered that +8,8 EM was also bad. +8,8 EM Looking with the scope at the Patch Panel monitor point it looks like the EM+ signal is missing. Looking directly at the BLS cable with the Ohm meter I see no indication of a shorted BLS cable and I can see the output capacitor on the BLS Driver hybrid (i.e. the BLS cable is not open). Meetings: MicroBooNE group meeting. Meeting with Mitch which resulted in a sign off on the BVDC card layout. Look at the output from the logic analyzer for some more L2 miss match errors that Jim Kraus has recorded and Recall the current setup of connections to the logic analyzer: Logic Analyzer Input Drew DeMux I.E. L2_Answers for Pod Signals Card Output L1 Specific Triggers --- ------- ----------- -------------------- A3 0:7 24:31 24:31 A2 0:7 16:23 16:23 A1 0:7 8:15 8:15 A0 0:7 0:7 0:7 C3 0:7 72:79 72:79 C2 0:7 64:71 64:71 C Clock_3 L2_Answer_Strobe D0 0 Mismatch Flag signal from L2 Global File written on 2-Oct-2007 at 2:18 PM ---------------------------------------- L2 Global set the miss match flag 45 usec after the TFW received the last set of L2 Answers from L2 Global. The last set of L2 Answers was: LA pod A0 = $30, i.e. 
Pass for trigger numbers 4 and 5 Reject for all others 396 usec earlier than that the TFW received a set of L2 Answers that were all Reject. 733 usec earlier than that the TFW received a set of L2 Answers that were: LA pod A0 = $04, i.e. Pass for trigger number 2. Reject for all others. File written on 4-Oct-2007 at 1:32 PM ---------------------------------------- L2 Global set the miss match flag 39 usec after the TFW received the last set of L2 Answers from L2 Global. The last set of L2 Answers was: LA pod A0 = $D0, i.e. Pass for trigger numbers 4, 6, and 7 Reject for all others 653 usec earlier than that the TFW received a set of L2 Answers that were all Reject. 976 usec earlier than that the TFW received a set of L2 Answers that were all Reject. 117 usec earlier than that the TFW received a set of L2 Answers that were all Reject. 178 usec earlier than that the TFW received a set of L2 Answers that were: LA pod A0 = $04, i.e. Pass for trigger number 2. Reject for all others. File written on 9-Oct-2007 at 12:45 PM ---------------------------------------- L2 Global set the miss match flag 34 usec after the TFW received the last set of L2 Answers from L2 Global. The last set of L2 Answers was: LA pod A0 = $F0, pod A1 = $00, pod A2 = $A7, pod A3 = $2B pod C2 = $30, pod C3 = $13 3.8 msec earlier than that the TFW received a set of L2 Answers that were all Reject. 620 usec earlier than that the TFW received a set of L2 Answers that were all Reject. 400 usec earlier than that the TFW received a set of L2 Answers that were: LA pod A0 = $04, i.e. Pass for trigger number 2. Reject for all others. File written on 9-Oct-2007 at 6:47 PM ---------------------------------------- L2 Global set the miss match flag 67 usec after the TFW received the last set of L2 Answers from L2 Global. The last set of L2 Answers was: LA pod A0 = $40, i.e. 
Pass for trigger number 6 Reject for all others In the previous 20 msec history stored in the Logic Analyzer there were no other sets of L2 Answers sent to the TFW. File written on 16-Oct-2007 at 16:04 PM ------------------------------------------ L2 Global set the miss match flag 35 usec after the TFW received the last set of L2 Answers from L2 Global. The last set of L2 Answers was: LA pod A0 = $04, i.e. Pass for trigger number 2 Reject for all others 797 usec earlier than that the TFW received a set of L2 Answers that were all Reject. 964 usec earlier than that the TFW received a set of L2 Answers that were all Reject. 1.7 msec earlier than that the TFW received a set of L2 Answers that were: LA pod A0 = $04, i.e. Pass for trigger number 2. Reject for all others. File written on 17-Oct-2007 at 12:11 PM ------------------------------------------ L2 Global set the miss match flag 39 usec after the TFW received the last set of L2 Answers from L2 Global. The last set of L2 Answers was: all Reject 1.77 msec earlier than that the TFW received a set of L2 Answers that were all Reject. 418 usec earlier than that the TFW received a set of L2 Answers that were all Reject. 1.14 msec earlier than that the TFW received a set of L2 Answers that were: LA pod A0 = $04, i.e. Pass for trigger number 2. Reject for all others. File written on 19-Oct-2007 at 8:08 AM ---------------------------------------- L2 Global set the miss match flag 35 usec after the TFW received the last set of L2 Answers from L2 Global. The last set of L2 Answers was: LA pod A0 = $E0, pod A1 = $00, pod A2 = $27, pod A3 = $6E pod C2 = $30, pod C3 = $13 582 usec earlier than that the TFW received a set of L2 Answers that were all Reject. 506 usec earlier than that the TFW received a set of L2 Answers that were: LA pod A0 = $04, i.e. Pass for trigger number 2. Reject for all others. 856 usec earlier than that the TFW received a set of L2 Answers that were all Reject. 
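
The hex pod values quoted in the notes above translate to L1 Specific
Trigger numbers one bit at a time, using the pod-to-trigger ranges
from the logic analyzer connection table (A0 = 0:7, A1 = 8:15,
A2 = 16:23, A3 = 24:31, C2 = 64:71, C3 = 72:79).  The following is a
small Python sketch of that bookkeeping, on the assumption that a set
bit means L2 answered Pass for that trigger.  The names POD_BASE and
passed_triggers are invented for this illustration only.

```python
# Lowest L1 Specific Trigger number carried by each logic analyzer pod,
# taken from the connection table in this log book.
POD_BASE = {"A0": 0, "A1": 8, "A2": 16, "A3": 24, "C2": 64, "C3": 72}


def passed_triggers(pods):
    """List the trigger numbers whose L2_Answer bit is set.

    pods maps a pod name (e.g. "A0") to the byte read on that pod.
    Pods that are not given are treated as all zeros (all Reject).
    """
    passed = []
    for pod, value in pods.items():
        base = POD_BASE[pod]
        for bit in range(8):
            if value & (1 << bit):
                passed.append(base + bit)
    return sorted(passed)


if __name__ == "__main__":
    print(passed_triggers({"A0": 0x30}))   # file written on 2-Oct-2007
    print(passed_triggers({"A0": 0xD0}))   # file written on 4-Oct-2007
    print(passed_triggers({"A0": 0x04}))   # the frequent "trigger 2" case
```

For the three cases shown this gives triggers 4 and 5, triggers 4, 6
and 7, and trigger 2, in agreement with the hand decoding in the notes.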
------------------------------------------------------------------------------

DATE:  26:28-Sept-2007          At:  Fermi

TOPICS:  Logic Analyzer look at L2_Answers from L2 Global,
         PROM Programmer, Spare TFW power supplies,
         Talked with Geoff about TCC's, FPD removal

Look at the captured files on the logic analyzer that is tied up to
the Drew DeMux card output and to the Mismatch Flag signal from
L2 Global.

Jim and company have captured 9 files.  I checked the filenames and
the operating system date stamps on the files to make certain that
everything lined up OK.  All looks OK except for one of them (which
is marked in the list below and which I think that I over wrote when
I started poking at the logic analyzer to look at the captured files).

In operating system time stamp order the files are:

                               Operating System
         FileName                 Time Stamp
  -----------------------   -------------------------
  l2_de_17sep07_246pm       17-Sept-2007   2:47 PM
  system3                   19-Sept-2007  11:06 AM
  system5                   20-Sept-2007   3:25 PM
  l2_de_21sep07_529pm       21-Sept-2007   5:58 PM
  l2_de_22sep07_307pm       22-Sept-2007   3:08 PM
  system8_23sept_1237pm     23-Sept-2007   1:05 PM
  l2_de_23sep07_1115pm      23-Sept-2007  11:16 PM
  system10_sept24_335pm     24-Sept-2007   4:05 PM
  l2_de_27sep07_752pm       28-Sept-2007  12:44 PM   <-----


Recall the current setup of connections to the logic analyzer:

  Logic Analyzer
      Input          Drew DeMux     I.E. L2_Answers for
  Pod    Signals     Card Output    L1 Specific Triggers
  ---    -------     -----------    --------------------
  A3      0:7          24:31              24:31
  A2      0:7          16:23              16:23
  A1      0:7           8:15               8:15
  A0      0:7           0:7                0:7
  C3      0:7          72:79              72:79
  C2      0:7          64:71              64:71
  C     Clock_3      L2_Answer_Strobe
  D0      0          Mismatch Flag signal from L2 Global

Recall that the Logic Analyzer is monitoring the L2_Answer_Strobe on
the L2_Answer 64:79 cable from the Drew DeMux card because that is
the L2_Answer cable that gives the "Strobe" signal to the TFW to tell
the TFW that a new set of L2_Answers has just arrived from L2 Global.

For compactness in some places below I have used the word "event"
in place of L2_Decision.


Look at logic analyzer file:  l2_de_17sep07_246pm
                              ---------------------

                    Drew DeMux Card Output L2_Answers
  Logic Analyzer    ---------------------------------   Setup Time Before
  Input Signals     L2 Answers from    L2 Answers for   DeMux Card Strobe
  Pod  (L1 Trig)    Previous Event       THIS Event      for THIS Event
  --------------    ---------------    --------------   ------------------
  A3    24:31             00                 D7             802 nsec
  A2    16:23             00                 FA             802 nsec
  A1     8:15             00                 00
  A0     0:7              00                 F0             902 nsec
  C3    72:79             00                 13             500 nsec
  C2    64:71             00                 30             500 nsec

  Notes:  The Strobe width for the L2_Answers for THIS event was
          400 nsec.

          The delay between receiving the L2_Answers for THIS event
          to receiving the Mismatch Flag from L2_Global was about
          35 usec.

          The Mismatch Flag signal was 342 nsec wide.

          The Previous L2_Decision was about 1 millisec earlier.
          The L2_Answers for the Previous L2_Decision were all zeros.

          The Previous Previous L2_Decision was about 116 usec earlier.
          The L2_Answers for the PP L2_Decision were all zeros.

          The PPP L2_Decision was about 921 usec earlier.
          The L2_Answer for the PPP L2_Decision was 04 in logic
          analyzer pod A0 and all zeros elsewhere, i.e. L2 answered
          Pass for L1 Specific Trigger Number 2 and Reject for all
          others.  The setup time for these L2_Answers was 896 nsec
          before a 404 nsec wide Strobe signal.


Look at logic analyzer file:  system3
                              -----------

                    Drew DeMux Card Output L2_Answers
  Logic Analyzer    ---------------------------------   Setup Time Before
  Input Signals     L2 Answers from    L2 Answers for   DeMux Card Strobe
  Pod  (L1 Trig)    Previous Event       THIS Event      for THIS Event
  --------------    ---------------    --------------   -----------------
  A3    24:31             00                 00
  A2    16:23             00                 00
  A1     8:15             02                 00             896 nsec
  A0     0:7              00                 00
  C3    72:79             00                 00
  C2    64:71             00                 00

  Notes:  The Strobe width for the L2_Answers for THIS event was
          404 nsec.

          The delay between receiving the L2_Answers for THIS event
          to receiving the Mismatch Flag from L2_Global was about
          697 usec.   <--

          The Mismatch Flag signal was 236 nsec wide.

          The Previous L2_Decision was about 282 usec earlier.
          The L2_Answer for the Previous L2_Decision was 02 in logic
          analyzer pod A1 and all zeros elsewhere, i.e. L2 answered
          Pass for L1 Specific Trigger Number 9 and Reject for all
          others.


Look at logic analyzer file:  system5
                              -----------

                    Drew DeMux Card Output L2_Answers
  Logic Analyzer    ---------------------------------   Setup Time Before
  Input Signals     L2 Answers from    L2 Answers for   DeMux Card Strobe
  Pod  (L1 Trig)    Previous Event       THIS Event      for THIS Event
  --------------    ---------------    --------------   -----------------
  A3    24:31             -                  00
  A2    16:23             -                  00
  A1     8:15             -                  00
  A0     0:7              -                  01             878 nsec
  C3    72:79             -                  00
  C2    64:71             -                  00

  Notes:  The Strobe width for the L2_Answers for THIS event was
          402 nsec.

          The delay between receiving the L2_Answers for THIS event
          to receiving the Mismatch Flag from L2_Global was about
          35 usec.

          The Mismatch Flag signal was 216 nsec wide.

          The L2 Answer for THIS event was a Pass for L1 Specific
          Trigger Number 0 and a Fail for all others.

          There was no Previous L2_Decision recorded in the full
          history memory of the logic analyzer.


Look at logic analyzer file:  l2_de_21sep07_529pm
                              ---------------------

                    Drew DeMux Card Output L2_Answers
  Logic Analyzer    ---------------------------------   Setup Time Before
  Input Signals     L2 Answers from    L2 Answers for   DeMux Card Strobe
  Pod  (L1 Trig)    Previous Event       THIS Event      for THIS Event
  --------------    ---------------    --------------   -----------------
  A3    24:31             00                 DC             780 nsec
  A2    16:23             00                 CC             780 nsec
  A1     8:15             00                 00
  A0     0:7              00                 F0             882 nsec
  C3    72:79             00                 13             482 nsec
  C2    64:71             00                 30             482 nsec

  Notes:  The Strobe width for the L2_Answers for THIS event was
          416 nsec.

          The delay between receiving the L2_Answers for THIS event
          to receiving the Mismatch Flag from L2_Global was about
          35 usec.

          The Mismatch Flag signal was 216 nsec wide.

          The Previous L2_Decision was about 99 usec earlier.
          The L2_Answers for the Previous L2_Decision were all zeros.

          The PP L2_Decision was about 590 usec earlier.
          The L2_Answer for the PP L2_Decision was 04 in logic
          analyzer pod A0 and all zeros elsewhere, i.e. L2 answered
          Pass for L1 Specific Trigger Number 2 and Reject for all
          others.


Look at logic analyzer file:  l2_de_22sep07_307pm
                              ---------------------

                    Drew DeMux Card Output L2_Answers
  Logic Analyzer    ---------------------------------   Setup Time Before
  Input Signals     L2 Answers from    L2 Answers for   DeMux Card Strobe
  Pod  (L1 Trig)    Previous Event       THIS Event      for THIS Event
  --------------    ---------------    --------------   -----------------
  A3    24:31             00                 EC             782 nsec
  A2    16:23             00                 5C             782 nsec
  A1     8:15             00                 00
  A0     0:7              04                 7D             882 nsec
  C3    72:79             00                 13             480 nsec
  C2    64:71             00                 30             480 nsec

  Notes:  The Strobe width for the L2_Answers for THIS event was
          420 nsec.

          The delay between receiving the L2_Answers for THIS event
          to receiving the Mismatch Flag from L2_Global was about
          36 usec.

          The Mismatch Flag signal was 216 nsec wide.

          The Previous L2_Decision was about 186 usec earlier.


Look at logic analyzer file:  system8_23sept_1237pm
                              -----------------------

                    Drew DeMux Card Output L2_Answers
  Logic Analyzer    ---------------------------------   Setup Time Before
  Input Signals     L2 Answers from    L2 Answers for   DeMux Card Strobe
  Pod  (L1 Trig)    Previous Event       THIS Event      for THIS Event
  --------------    ---------------    --------------   -----------------
  A3    24:31             00                 D5             780 nsec
  A2    16:23             00                 D2             780 nsec
  A1     8:15             00                 00
  A0     0:7              04                 7D             876 nsec
  C3    72:79             00                 13             476 nsec
  C2    64:71             00                 30             476 nsec

  Notes:  The Strobe width for the L2_Answers for THIS event was
          400 nsec.

          The delay between receiving the L2_Answers for THIS event
          to receiving the Mismatch Flag from L2_Global was about
          44 usec.

          The Mismatch Flag signal was 198 nsec wide.

          The Previous L2_Decision was about 390 usec earlier.


Look at logic analyzer file:  l2_de_23sep07_1115pm
                              ----------------------

                    Drew DeMux Card Output L2_Answers
  Logic Analyzer    ---------------------------------   Setup Time Before
  Input Signals     L2 Answers from    L2 Answers for   DeMux Card Strobe
  Pod  (L1 Trig)    Previous Event       THIS Event      for THIS Event
  --------------    ---------------    --------------   -----------------
  A3    24:31             00                 97             776 nsec
  A2    16:23             00                 05             776 nsec
  A1     8:15             00                 00
  A0     0:7              04                 70             880 nsec
  C3    72:79             00                 13             488 nsec
  C2    64:71             00                 30             488 nsec

  Notes:  The Strobe width for the L2_Answers for THIS event was
          400 nsec.

          The delay between receiving the L2_Answers for THIS event
          to receiving the Mismatch Flag from L2_Global was about
          182 usec.

          The Mismatch Flag signal was 198 nsec wide.

          The Previous L2_Decision was about 1.3 millisec earlier.


Look at logic analyzer file:  system10_sept24_335pm
                              -----------------------

                    Drew DeMux Card Output L2_Answers
  Logic Analyzer    ---------------------------------   Setup Time Before
  Input Signals     L2 Answers from    L2 Answers for   DeMux Card Strobe
  Pod  (L1 Trig)    Previous Event       THIS Event      for THIS Event
  --------------    ---------------    --------------   -----------------
  A3    24:31             00                 00
  A2    16:23             00                 00
  A1     8:15             00                 00
  A0     0:7              04                 04
  C3    72:79             00                 00
  C2    64:71             00                 00

  Notes:  The Strobe width for the L2_Answers for THIS event was
          402 nsec.

          The delay between receiving the L2_Answers for THIS event
          to receiving the Mismatch Flag from L2_Global was about
          35 usec.

          The Mismatch Flag signal was 538 nsec wide.

          The Previous L2_Decision was about 2 millisec earlier.


Other items this trip:

Peter Neustroev is back from Russia and he will be here for the next
3 months.  He would like to use MSU's PROM programmer again.  I told
him that I would bring it back to D-Zero.  I need to cook a spare
SCLD_Substitute PROM before I bring it here again (and maybe another
Becane_Too boot PROM).

Collaboration meeting this trip.
The next collaboration meetings are:

       December  3-7,   2007    Fermilab
       February 18-22,  2008    Fermilab
       May      19-23,  2008    Fermilab
       August   11-16,  2008    Prague
       November 10-14,  2008    Fermilab

Run the two spare TFW power supply chassis that are at D-Zero Hall.
They are TFW power supplies SN#2 and SN#4.  Set these spare power
supply chassis to the average of what the power supply terminal
Voltages are in the running TFW (in order to give the proper Voltage
at the TFW crate loads).

                    Power Supply Terminals
      -------------------------------------------------
               M122                      M123
      -----------------------   ----------------------
       Top     Mid     Bot       Top     Mid     Bot     Average
      -----   -----   -----     -----   -----   -----    -------
      5.089   5.054   5.112     5.053   5.064   5.054     5.071
      3.329   3.332   3.348     3.325   3.334   3.331     3.333
      2.167   2.134   2.119     2.143   2.116   2.097     2.129
      4.691   4.623   4.620     4.653   4.618   4.599     4.634

Geoff asked about plans for swapping TCC's during this shutdown.
I told him that we (Philippe and I) had never received a reply from
the bosses and thus had done nothing.  We also talked about the
general emergency swap plans.  He will contact Philippe directly so
that they can discuss this topic.

Rich Partridge removed the FPD stuff from M114 M115 area.  I pulled
out from the Master Clock rack the NIM stuff that was feeding signals
to the FPD stuff, i.e. pulled out the NIM monitor signal cables from
the clock fanout modules and pulled out a then unused NIM card from
the "private" NIM crate in the Master Clock Rack.  This was feeding
signals to the FPD stuff on the MCH-1 back porch.  There are still
2 face racks of abandoned FPD stuff (NIM and CAMAC crates) running on
the back porch mixed in with what looks like an important SMT heater
supply distribution.

At PAB I disconnected the Trig rack and the DAQ rack from the Bo
cryostat for the ILC pump move.  Tour of PAB setup BW and VR.
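The Average column of the Power Supply Terminals table in this entry can be
rechecked with a few lines of Python.  This is only an illustrative sketch.
The rail labels are an assumption: the table rows are not labeled, but their
values match the +5.0V, +3.3V, -2.0V and -4.5V terminal readings (magnitudes)
recorded for M122 Top in the 13,14-Sept-2007 entry.  The function name
average_setpoints is made up.

```python
# Terminal Voltage magnitudes, in the order
# M122 Top, Mid, Bot, then M123 Top, Mid, Bot.
TERMINAL_VOLTS = {
    "+5.0V": [5.089, 5.054, 5.112, 5.053, 5.064, 5.054],
    "+3.3V": [3.329, 3.332, 3.348, 3.325, 3.334, 3.331],
    "-2.0V": [2.167, 2.134, 2.119, 2.143, 2.116, 2.097],  # assumed rail label
    "-4.5V": [4.691, 4.623, 4.620, 4.653, 4.618, 4.599],  # assumed rail label
}


def average_setpoints(readings):
    """Return the mean terminal Voltage per rail, rounded to 1 mV."""
    return {rail: round(sum(v) / len(v), 3) for rail, v in readings.items()}


if __name__ == "__main__":
    for rail, volts in average_setpoints(TERMINAL_VOLTS).items():
        print(f"{rail}: set spare supply terminals to {volts:.3f} V")
```

The averages reproduce the 5.071, 3.333, 2.129 and 4.634 values used to set
the spare chassis.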
------------------------------------------------------------------------------ DATE: 13,14-Sept-2007 At: Fermi TOPICS: TFW problem after the weekend power outage Thursday morning the system was clearly stuck, i.e. it was running at only 1 or 2 Hz, TFW $1F was 99.8% L1 Busy, and Routing Master was seeing only even L3_Transfer_Numbers. Read Reg 40 of Chip 1 on the FOM++. It looks like a good balance of even and odd. It is clear that the contents of this register only update with the Capture Monitor Data signal, which happens two times every 5 seconds because of the 2 sweeps of monitor data, i.e. one for TrgMon type customers and one for Luminosity type customers. I looked at the L3_Transfer_Number cable where it plugs into the back of the Routing Master. I could clearly see that the LSBit was not changing. The LSBit of the L3_Transfer_Number was a good differential ECL signal but it was not changing. I could see higher order bits changing. I looked at the Clock Enable signal that comes into the D-Latch FM card for the L3_Transfer_Number going to the Routing Master. This clock enable signal looked clean and nice. I do not know that it was happening at the right time but this signal looked clean and exactly like it should - the right width. Start following the L3_Trans_Number / L1_Qualifier path from the FOM++ output, past the D-Latch, over to the SCL Hub-End. As I was following this path, wiggling things a little, trying to find the best place to tap in and look at the LSBit - things started to work again. This was about 11 AM. I do not know exactly what I wiggled to fix it. It could even have been wiggling the Clock Enable signal to the D-Latch. Work on checking Power Supplies. The Vee supply in M123 Bot was only about -4.400V so I decided to check and adjust all the supplies in M122 and M123. Adjusted supplies are marked with a "*". Only the M123 Bottom Vtt and Vee were very far off. Was M123 Bottom the supply that was replaced some years ago ? 
M122 Middle -2V is a bit high but I would need to do some major
bending of power cables to get at its trim pot.

Also check the Voltage from the TOM card's black ground test point to
the rack ground.  And also check the Voltage drop on the common return
run from the power supply common to the backplane common.

Measure everything on Thursday afternoon and then check the TOM test
points again on Friday.


M122 Top        TOM Ground to Rack Ground               =   0 mV
-----------     Power Supply Common to Backplane Common = -14 mV

   Power       TOM        Back Plane   Power Supply     On Friday
   Supply   Test Points    Bus Bars     Terminals     TOM Test Pnts
   ------   -----------   ----------   ------------   -------------
   +5.0V       5.042        5.051         5.089           5.043
   +3.3V       3.332        3.333         3.329           3.332
 * -2.0V       2.003        2.015         2.167           2.004
 * -4.5V       4.505        4.512         4.691           4.505


M122 Middle     TOM Ground to Rack Ground               =   0 mV
-----------     Power Supply Common to Backplane Common =  -8 mV

   Power       TOM        Back Plane   Power Supply     On Friday
   Supply   Test Points    Bus Bars     Terminals     TOM Test Pnts
   ------   -----------   ----------   ------------   -------------
 * +5.0V       5.023        5.030         5.054           5.023
 * +3.3V       3.333        3.333         3.332           3.332
   -2.0V       2.025        2.030         2.134           2.026
 * -4.5V       4.504        4.508         4.623           4.505


M122 Bottom     TOM Ground to Rack Ground               =   0 mV
-----------     Power Supply Common to Backplane Common =  -8 mV

   Power       TOM        Back Plane   Power Supply     On Friday
   Supply   Test Points    Bus Bars     Terminals     TOM Test Pnts
   ------   -----------   ----------   ------------   -------------
   +5.0V       5.049        5.058         5.112 Fz        5.049
 * +3.3V       3.333      no access       3.348 Fz        3.334
   -2.0V       2.017        2.027         2.119           2.017
   -4.5V       4.517        4.522         4.620           4.519


M123 Top        TOM Ground to Rack Ground               =   0 mV
--------        Power Supply Common to Backplane Common = -14 mV

   Power       TOM        Back Plane   Power Supply     On Friday
   Supply   Test Points    Bus Bars     Terminals     TOM Test Pnts
   ------   -----------   ----------   ------------   -------------
   +5.0V       5.022        5.032         5.053           5.023
 * +3.3V       3.332      no access       3.325           3.333
   -2.0V       2.002        2.013         2.143           2.002
   -4.5V       4.501        4.506         4.653           4.497


M123 Middle     TOM Ground to Rack Ground               =   0 mV
-----------     Power Supply Common to Backplane Common =  -7 mV

   Power       TOM        Back Plane   Power Supply     On Friday
   Supply   Test Points    Bus Bars     Terminals     TOM Test Pnts
   ------   -----------   ----------   ------------   -------------
   +5.0V       5.035        5.045         5.064           5.035
 * +3.3V       3.332      no access       3.334           3.333
 * -2.0V       2.005        2.015         2.116           2.005
 * -4.5V       4.502        4.506         4.618           4.503


M123 Bottom     TOM Ground to Rack Ground               =   0 mV
-----------     Power Supply Common to Backplane Common =  -7 mV

   Power       TOM        Back Plane   Power Supply     On Friday
   Supply   Test Points    Bus Bars     Terminals     TOM Test Pnts
   ------   -----------   ----------   ------------   -------------
   +5.0V       5.028        5.039         5.054           5.028
 * +3.3V       3.333        3.333         3.331           3.332
 * -2.0V       2.006        2.018         2.097           2.007
 * -4.5V       4.504        4.508         4.599           4.503


M124 Top   SCL Hub-End  VME Communications Crate
-------------------------------------------------

                Backplane Common to Rack Ground         =   1 mV
                Power Supply Common to Backplane Common = +12 mV

   Power    Back Plane   Power Supply
   Supply    Bus Bars     Terminals
   ------   ----------   ------------
   +5.0V      5.023         5.065
   +3.3V      3.308         3.347
   -2.0V      2.016         2.048
   -4.5V      5.144         5.228


M124 Bottom  TFW Readout
-------------------------

                Backplane Common to Rack Ground         =   0 mV
                Power Supply Common to Backplane Common = +10 mV

   Power    Back Plane   Power Supply
   Supply    Bus Bars     Terminals
   ------   ----------   ------------
   +5.0V      5.010         5.049
   +3.3V      3.321         3.332
   -2.0V      2.044         2.048
   -4.5V      5.232         5.226


The FOM++ card that has been running in the TFW for the past "N" years
is AONM card SN# FOM-13B.

- Modification of FOM-13B to handle the "Hardware L1_Qualifiers" is
  covered in the 6:7-Nov-2003 log book entry and in
  www.pa.msu.edu/hep/d0/ftp/l1/framework/hardware/aonm/l1_qualifier_signal_path.txt

The L3_Transfer_Number D-Latch card that has been running in the TFW
for the past "N" years is FM card SN# FM-17.

- The spare FM cards at D-Zero are SN# FM-06 and FM-02.
  Based on how the address switches are set on these cards I believe
  that FM-06 must have been the old L1 Cal Trig Readout Helper and
  thus it has lots of hours on it and it has not been flaky.  BUT I
  just checked and L1 Cal Trig Readout Helper used Chip #4 on its FM
  card and the L3 Trans number is currently Latched by Chip #13 in its
  FM D-Latch card.

- Note that the Output only of the L3_Transfer_Number D-Latch card is
  cabled up so that it could also send the Tick and Turn number over
  to the Routing Master.

- Only the L3_Transfer_Number is delivered to the input of this
  D-Latch card.  It is delivered as MSA_In 112:127.  It is processed
  by Chip #13.  It comes out as MSA_Out 32:47.

- The Clock Enable signal for the Latch appears to arrive on pins 1
  and 2 of the front panel auxiliary connector.

- If we read Register #8 of Chip #13 we can see the L3_Transfer_Number
  that has been latched and held by this card.  The documentation says
  that this only updates with Capture_Monitor_Data but maybe that is
  not how the D-Latch FPGA design was actually made or maybe the BSF
  on the D-Latch card is not programmed to deliver the Cap_Mon_Data
  signal on the required HQ_Timing line to the Main Array.

Thursday evening: power down, pull the Rear Paddle card off of M123
Middle Slot 20, i.e. the "PassThrough" slot for the output of the
FOM++.  This Rear Paddle card and the cable connectors that plug into
it look just fine.

- FOM++ MSA_Out 0:15 are the L1_Qual / L3_Trans_Num for the first
  8 Fan Out cards in the SCL Hub-End and provide the L3_Trans_Num to
  the D-Latch that feeds the Routing Master.

- FOM++ MSA_Out 16:31 are the L1_Qual / L3_Trans_Num for the second
  set of 8 Fan Out cards in the SCL Hub-End.

Next pull off the Rear Paddle card that feeds the L3_Tran_Num into
M123 Bottom Slot 20, i.e. the L3_Trans_Num D-Latch.  This Rear Paddle
card and the connectors that plug into it look fine.  I checked the
Paddle Card traces for resistance as I flexed the card.
I did not pull the paddle card off of the front of the FOM++ nor did
I un-stack M123 Middle so that I could pull out and test the
PassThrough card for the FOM++ output signals.

No need to check the Front Paddle card on the D-Latch because this
morning, while it was sending all evens, I know that there was a good
ECL signal on the LSBit at the Routing Master end of the L3_Trans_Num
cable.

Pull out FM SN# 17 which has been the L3_Trans_Num D-Latch for the
past "N" years.  Install FM SN# 06 in its place.  Card species set to
$30.  Card Address set to $3A.  Install two 120 Ohm resistor packs at
the pin #1 end of the P5 connector, i.e. the front panel auxiliary
connector.

- An alternative and better idea than replacing the FM D-Latch card
  would have been to switch to using a different block of D-Latching
  on the same card, i.e. switch to using MSA_In 0:15  Chip #4
  MSA_Out 0:15.  Doing this would require: changing the Front Paddle
  Card so that the Strobe to the Routing Master gets put on wires
  33,34 of the MSA_Out 0:15 cable instead of onto the MSA_Out 32:47
  cable as it is now.  It may also require: Configuring logic into
  Chip #4 and setting up registers in Chip #4 or the BSF.  I did not
  check in any detail on these last points.

After changing the FM D-Latch card for the L3_Trans_Num then
Configure 1531 with zero error.  Initialize.  Start ZB with just $1F
and $20.

Notes:  I doubt that this was a break between the ECL output driver on
        the FM D-Latch card because I saw good diff ECL at the Routing
        Master end of the cable.  I don't think it can be a break
        between the FPGA output pin and the ECL driver because then
        the TTL input to the ECL driver would float HI and the LSBit
        was always Low.

Friday.  It ran over night with no problems.  The Routing Master log
has not shown a problem with the L3_Transfer_Number since about 10
Thursday morning.

Check the TFW power supplies again at the TOM test points.  They look
OK.

There were three spare version B AONM/FOM cards at Fermi.
They are serial numbers SN# FOM-09B  FOM-26B and AONM-05B.  I brought
down two more revision B AONM/FOM cards.  They are serial numbers:
SN# AONM-08B and SN# AONM-19B.  I'm going to leave all 5 of these
spare revision B AONM/FOM down here.  On the next trip down I will
bring the tools so that I can add the white wires to make another
FOM++ that can handle the Hardware L1_Qualifiers.  I *think* that
such a modified AONM/FOM can still be used as a normal AONM/FOM.

Carefully check out the FM D-Latch card that was pulled out last
night (SN# FM-17).  Check all of the traces and vias and such that
are involved with the LSBit of the L3_Trans_Num, i.e. MSA_In 112 and
MSA_Out 32.  I could find no problems in any of this.  Checked with
Ohm meter while doing a moderate amount of flexing.

If we get into this L3_Trans_Num problem again, and if we are in the
middle of good beam Physics running, then we may need to tell people
over the phone how to ground themselves and wiggle some cards and
cables.  I would wiggle: the Front Paddle card on the FOM++, the Rear
Paddle card on the PassThrough card, the Rear Paddle card on the FM
D-Latch.  I have put an obvious big label on the cable that runs
between these two Rear Paddle cards.  One of them is approximately
right above the other in M123 Middle and Bottom.  The label says,
"This One".

Drew DeMux Card and the Logic Analyzer.  I will leave the Logic
Analyzer out on the MCH-1 floor tied up to the Drew DeMux (as
explained in last week's log book entry) and to the "flag" signal
from L2 Global that says that L2 Global has received, over the SCL,
an L2_Decision that does not match what it was expecting.  This flag
is used to trigger (stop) the logic analyzer.  We tested looking at
this flag signal.  I pick it off with a special adaptor cable down in
the floor with the ADAM_PB card for the scaler control signals from
L2 Global.  The logic analyzer is sampling every 20 nsec and
recording 1 mega samples with the trigger set to 97% of the way
through the history.
The flag signal is on group D0 signal 0.

 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

DATE: 12-Sept-2007     At: MSU  action at Fermi

TOPICS: TFW problem after the weekend power outage

Wednesday.  The problem came back (sometime Tuesday night I guess).

We tried reading Reg 48 in Chip 1 of the FOM++, i.e. the
L3_Transfer_Number monitor data read register.  While the DAQ was in
trouble we thought that this register was even more of the time than
it was odd - but we only read it 8 or 10 times or so.

With the TFW Paused we also read all 8 FPGAs, i.e. Chips 1:8 on the
FOM++ that have a copy of the L3_Transfer_Number.  All 8 copies were
the same.

We tried Configuring just: L2_Helper, D-Latch for the
L3_Transfer_Number going to the Routing Master, and the FOM++.  This
got things running again.

It failed again later in the afternoon.  We tried Configuring: just
L2_Helper, then just D-Latch, then once at a time just logical
subsets of FPGAs in the FOM++.  None of that helped.  Then we
Configured all 3 of these cards and things started to work again for
a while.

Give up and drive to Fermi.

 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

DATE: 10-Sept-2007     At: MSU  action at Fermi

TOPICS: TFW problem after the weekend power outage

Monday.  They called at about 5PM because they had not been able to
get the DAQ running again after power had been restored after the
weekend power outage.

When trying to run ZB: $1F was L1 Busy about 99.8%, there were about
14 L1 decisions awaiting their L2 answers, people said that the
Routing Master was mostly only seeing even "event numbers", i.e.
L3 Transfer Numbers.

Of the 4 L1 Trigs that make up ZB, if we turn off all but the 500 Hz
L1 Trig that always receives a L2 Reject then things ran fine.

At about 7:30 PM they reloaded the Master Clock Time Lines and that
did not help.

At about 8PM they Configured all the TFW FPGAs and then things took
off and ran OK.
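The "only even L3 Transfer Numbers" symptom amounts to the LSBit being stuck
low.  Given a list of transfer numbers (for example from repeated register
reads or from the Routing Master log), stuck bits can be found by ANDing and
ORing all of the samples.  This is a minimal sketch, not an existing TFW or
TRICS tool; the function name stuck_bits is made up, and the 16-bit width is
taken from the L3_Transfer_Number being carried on a 16 signal MSA cable.

```python
def stuck_bits(values, width=16):
    """Return (stuck_low, stuck_high) bit positions over all samples.

    A bit is "stuck low" if it is 0 in every sample and "stuck high" if
    it is 1 in every sample.  With only a few samples the high order
    bits will also look stuck, so use a long enough list.
    """
    if not values:
        raise ValueError("need at least one sample")
    all_and = (1 << width) - 1
    all_or = 0
    for v in values:
        all_and &= v
        all_or |= v
    stuck_low = [b for b in range(width) if not (all_or >> b) & 1]
    stuck_high = [b for b in range(width) if (all_and >> b) & 1]
    return stuck_low, stuck_high


if __name__ == "__main__":
    # Simulated symptom: every transfer number seen is even.
    evens_only = list(range(0, 1 << 16, 2))
    print(stuck_bits(evens_only))
```

A handful of reads, such as the 8 or 10 reads mentioned above, is not enough
to separate a stuck bit from chance, which is why the sketch is meant for a
long list of samples.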
------------------------------------------------------------------------------

DATE: 6:8-Sept-2007     At: Fermi

TOPICS: Work on connecting Logic Analyzer to the Drew DeMux card output

Brought to D-Zero the Tektronix TLA-622 logic analyzer with its
instruction book and mouse and keyboard.  Also made and brought to
Fermi a 64 channel card to "bridge" across differential ECL signals
and turn them into TTL level signals that you can feed into the logic
analyzer.

Setup of the Diff ECL Adapter Card and Logic Analyzer Input

There are a total of 64 channels of Diff ECL to Logic Analyzer Input
on this adapter card.

The twist-and-flat cables at the input to the adapter card have 4 of
the 34 pin IDC connectors.

The first and second 34 pin IDC connectors (the first one is labeled
0:15 and the second one is labeled 16:31) have 16 signals pass
through them, i.e. pins 33 and 34 are not wired up between the male
and female IDC connectors in these circuits.

The third IDC connector (labeled 15) has just 15 circuits through it
wired up, i.e. pins 31, 32, 33, and 34 are not wired from the male to
the female connectors in these circuits.

The fourth IDC connector (labeled 17) has all 17 circuits in it wired
up, i.e. all 34 pins run from the male to the female connectors of
these circuits.

                                    Logic Analyzer
    Diff ECL Adapter Card               Input
  -------------------------        --------------
  Connector       Pins             Pod    Signals
  ---------   -------------        ---    -------
    0:15       1,2 - 15:16         A0       0:7
    0:15      17,18 - 31:32        A1       0:7

   16:31       1,2 - 15:16         A2       0:7
   16:31      17,18 - 31:32        A3       0:7

    15         1,2 - 15:16         D0       0:7
    15        17,18 - 29:30        D1       0:6

    17         1,2 - 15:16         C2       0:7
    17        17,18 - 31:32        C3       0:7
    17           33,34             C      Clock_3

So we can not see on the Logic Analyzer Pod D1 signal 7 with the
current setup of the 64 channel Diff ECL to Logic Analyzer card.

Verified that the L2_Answer Strobe that the L2FW is using really
comes from the 64:79 output of the Drew DeMux card.
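The adapter card table can also be written as a small lookup, which is handy
when tracing a single wire pair to a logic analyzer channel.  This is only a
sketch of the table above; the names CONNECTOR_PODS, WIRED_CIRCUITS and
pod_signal are made up, and the connector names are the ones used in the
table.  It assumes each differential pair uses pins (2k+1, 2k+2) for circuit
k, which is how the pin ranges in the table read.

```python
# Pods fed by each 34 pin IDC connector (circuits 0..7, then 8..15).
CONNECTOR_PODS = {
    "0:15": ("A0", "A1"),
    "16:31": ("A2", "A3"),
    "15": ("D0", "D1"),
    "17": ("C2", "C3"),
}
# Number of circuits actually wired through each connector.
WIRED_CIRCUITS = {"0:15": 16, "16:31": 16, "15": 15, "17": 17}


def pod_signal(connector, odd_pin):
    """Map the odd pin of a pair to (pod, signal), or None if not wired."""
    if odd_pin % 2 == 0 or not 1 <= odd_pin <= 33:
        raise ValueError("odd_pin must be an odd pin number from 1 to 33")
    circuit = (odd_pin - 1) // 2          # pins 1,2 -> 0 ... pins 33,34 -> 16
    if circuit >= WIRED_CIRCUITS[connector]:
        return None                        # pair not wired on this connector
    if circuit == 16:
        return ("C", "Clock_3")            # 17th circuit on connector "17"
    return (CONNECTOR_PODS[connector][circuit // 8], circuit % 8)


if __name__ == "__main__":
    print(pod_signal("0:15", 17))   # first pair of the second pod
    print(pod_signal("15", 31))     # the pair that would be D1 signal 7
```

The second call returns None, which is the same statement as the note that
Pod D1 signal 7 can not be seen with the current setup.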
Tie up to the Drew DeMux card output so that:

   Logic Analyzer
       Input            Drew DeMux
   Pod    Signals       Card Output
   ---    -------     ---------------
   A0       0:7             0:7
   A1       0:7             8:15
   A2       0:7            16:23
   A3       0:7            24:31

   C2       0:7            64:71
   C3       0:7            72:79
   C      Clock_3     L2_Answr_Strobe

I made a 12 pair male to female extension cable so that I could put
it in the circuit of the "scaler control cable" from the L2 Global
node so that I could tap off the "8th" signal from the L2 Global node
as agreed with Bob and Jim to use it as the "L2 Global has seen a
problem with the L2_Decision that it received over the SCL" flag.
The initial testing of this signal from the L2 Global node is covered
in the 26:28-JUNE-2007 log book entry.  I have tied this signal to
the logic analyzer pod "D0" signal 0.

Log book entries about the initial connection of the L2 system to the
L2FW and initial testing with the L2 Global node are in the log book:
10-JAN-02, 16-JAN-02, 27-FEB-02, 12-MAR-02.

------------------------------------------------------------------------------

DATE: 20:22-AUG-2007     At: Fermi

TOPICS: Meetings, No Logic Analyzer for L2, Checks

Meetings on Monday and Tuesday.  Walk throughs of MiniBooNE and Minos
Near Detector.

I did not bring the Logic Analyzer and 64 channel Diff ECL to LA box
and a power supply for it to D-Zero this trip.  I need to get that
ready by next trip.  Our standard old 4 channel ECL box has been
carried off by some mouse (Muon Cal Track Match I assume) so I will
have to setup something else for getting the trigger into the LA.

All fans are running OK and things look closed up OK.  It is only the
back fan that we are running now in the Routing Master crate.

George put signs on the back of the TFW racks saying that no one
should play around above the racks.
------------------------------------------------------------------------------

DATE: 17:19-JULY-2007     At: Fermi

TOPICS: Bring spare TCC and Optic Bit-3 to Fermi, spare copper Bit-3
        at Fermi, close TFW TCC log file, Master Clock rack

Brought the spare MSU TCC computer to Fermi.  This is the HP xw4200
computer HP SN# 2UA5230VSL with its keyboard, mouse, and monitor
output "Y" cable.  This is a MS Windows computer.  It is the one that
had been at PAB doing DAQ-96 on Bo.  Installed in it is the PCI end
of an optical Bit-3.  Also brought to Fermi is the VME end of this
Bit-3 and the optical fiber cable that goes along with it.

This computer is stored locked in the brown storage cabinet next to
the wall on the first floor of D-Zero hall.  The VME half of the
Bit-3 with its optical fiber are stored in their Bit-3 packaging in
the yellow rack storage cabinet next to the brown cabinet along the
wall.

I verified that this computer booted up OK after its trip to Fermi.
I have printed a copy of Philippe's July 13th 2007 "TCC Disaster
Recovery" note and left it with this computer in the brown cabinet.
I taped a paper tag to the computer to identify it as the spare TCC.

Note that there is also a spare "copper Bit-3" in the yellow rack
storage cabinet.  I think this is a model 617 and it has a spare
copper cable with it.

Are there any spare parts for the PCI Expansion equipment for the
L2 TCC ?

Thursday evening at about 20:45 I started a new TFW TCC log file.
The newly started log file includes in its filename "20070718".  The
log file that was just closed has filename "20070711".  Copy log file
"20070711" to MSU for archive and study.

Looking at log file "20070711" I see that there were two resets of
the M122 Top crate VME bus.  They were on:

   Resetting M122-Top LBN= 0x00547872   14-Jul-2007 14:47:24.792
   Resetting M122-Top LBN= 0x005489c9   17-Jul-2007 10:39:30.886

Friday morning at 8AM I found the front door to the Master Clock rack
wide open.
Looked at the power supply fans and rack blowers.

Meeting at PAB.  It looks like there may be 2 months of work before
Bo is cold and full of liquid.  Got resistor samples from Walter.

------------------------------------------------------------------------------

DATE: 26:28-JUNE-2007     At: Fermi

TOPICS: Test L2 Global LA Trig, Collect L2_Acpt Bump data for Dean,
        M121 Fan and Fuse, Meetings with Mitch and Walter,
        Computer Bit-3 to MSU

Between Stores Jim Kraus loaded a test version of the L2 Global
executable that pulsed the 8th bit in the cable that runs from the
L2 Global to the Foreign Scalers for L2 State Monitoring.  This test
executable pulsed this line high each time L2 Global processed an
event.  I could see a 200 nsec pulse on the scope when connected to
this line.  On the Adam_PB card this is connector J3 pins 19,20 just
as labeled in silk.  I think that the 9th bit is pins 17,18.  This is
a 24 pin connector !  Do we have any 24 pin female headers to make a
break-out cable ?

The MSU ECL Box was nowhere to be found - I need to send Ken a note.

The Adam_PB card for Global is not in a plastic box as I asked him to
do - and thus it is corroded where it has been sitting in water on
the concrete floor.  I should send him a picture.  He did put two of
the four Adam_PBs in boxes.

I shifted the delay for the "Dean Mode" of the SCLD to the end of the
Cal Precision readout process and captured (I think) a couple of
L2_Acpt Bumps from the end of Cal readout.  The instructions for
doing this follow:

Instructions for collecting L2_Accept Bump Data for Dean.  This set
of instructions is mostly from Philippe's note about this from
30 May 2007.

To capture L2_Acpt Bump data at the BEGINNING of the Cal Precision
Readout Process set the SCLD to "Dean Mode" and open only Key #4 on
the SCLD delay DIP switch, i.e. 8x16= 128 ticks of 132 nsec or
16.9 usec of delay between the L2 Decision and the stopping of the
circular buffer writes.
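The delay arithmetic for the SCLD delay DIP switch can be sketched in a few
lines.  This is an illustration only, not SCLD firmware or TCC code.  It
assumes that key #k contributes 2^(k-1) x 16 ticks of 132 nsec; that weight
is inferred from the three keys quoted in these instructions (key #1 = 1x16,
key #4 = 8x16, key #8 = 128x16 ticks) and has not been checked against the
SCLD documentation.  The function name scld_delay is made up.

```python
TICK_NSEC = 132        # one tick, as quoted in the instructions
TICKS_PER_UNIT = 16    # each switch unit is 16 ticks


def scld_delay(open_keys):
    """Return (ticks, microseconds) of delay for the given open DIP keys.

    Assumption: key k (1..8) adds 2**(k-1) units of 16 ticks each.
    """
    for key in open_keys:
        if not 1 <= key <= 8:
            raise ValueError("DIP switch keys are assumed to be 1..8")
    ticks = sum((1 << (key - 1)) * TICKS_PER_UNIT for key in set(open_keys))
    return ticks, ticks * TICK_NSEC / 1000.0


if __name__ == "__main__":
    for keys in ([4], [1, 4, 8]):
        ticks, usec = scld_delay(keys)
        print(f"keys {keys}: {ticks} ticks, {usec:.1f} usec")
```

With only key #4 open this gives 128 ticks and about 16.9 usec; with keys
#1, #4 and #8 open it gives 2192 ticks and about 289 usec.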
To capture L2_Acpt Bump data at the END of the Cal Precision Readout
Process set the SCLD to "Dean Mode" and open Keys #1, #4, and #8 on
the SCLD delay DIP switch, i.e.

       1x16=   16 ticks of 132 nsec  or    2.1 usec
       8x16=  128 ticks of 132 nsec  or   16.9 usec
     128x16= 2048 ticks of 132 nsec  or  270.0 usec

or a total of 289 usec of delay between the L2 Decision and the
stopping of the circular buffer writes.

Then to capture an L2_Acpt Bump event:

- get a ZB run going

  To collect the Dean L2_Acpt Bump data we must have pure L2_Acpts
  and not just L2_Decisions.  Thus we need to stop all of the
  L1 Specific Triggers in the ZB mix that receive an L2_Reject.

- Stop the normal once every 5 second L1 Cal Trig Monitor Data
  Collection
      from the L1Cal GUI main menu, pick "Control/Status"
      and click "Monitoring --> Stop Collection"

- pause the ZB run

- from the L1Cal GUI main menu, pick "GUI Cmd File"

- click "Locate File..." and navigate to
      /tcc/L1Cal_IIb_Work/CommandFiles/Collect_RawAdc_MonitData.cmd

- click "Exec GUI CmdFile"

- this command file will first send 2x COOR-like messages to setup
  the raw adc memories to the capture monitoring data mode and then
  it will send lots of RIO messages to reset all the address
  generators.  Then this command will start polling the Write Enable
  register of one of these ADF cards to see if an event was captured,
  and print a one line message every second until an event is
  captured.  When you see these messages then

- Resume the ZB run (just the L2_Accepted L1 Spec Trigs).  This will
  cause an L2_Acpt to be issued over the SCL.

- After a delay, the L2_Acpt will cause the SCLD card to issue the
  Save_Monitor_Data signal and then this command file will read out
  all the circular buffers, and splice them into the output file,
  named e.g.
      RawAdc_D20070530_T150720_E001.dat
  in  /tcc/L1Cal_IIb_Work/LogFiles/CommandFiles

- the command file will stop after one event.

- if the command file gets stuck, you can kill the GUI ascii window
  with no ill effect.
  There should be two icons at the bottom of the screen.  One looks
  like an FPGA, and it is for the C++ engine, and I can't remember
  what the one next to it for the GUI looks like, maybe the frog ?

- Do this loop of: Pausing all L1 Trigs, running the
  Collect_RawAdc_MonitData.cmd command file, resuming the ZB
  L1 Specific Triggers that have L2_Acpt for as many L2_Acpt Bump
  events as you want to take.

- When you are finished collecting L2_Acpt Bump events then resume
  the normal once every 5 second L1 Cal Trig Monitor Data Collection
  by:
      from the L1Cal GUI main menu, pick "Control/Status"
      and click "Monitoring --> Resume Collection"

appendix:

- To Configure all ADF + TAB/GAB with the "official" software
      from the L1Cal GUI main menu, pick "Control/Status"
      click "Configure"

- To Initialize all ADF + TAB/GAB including Mike's "canned" L1Cal
  programming.
      from the L1Cal GUI main menu, pick "Control/Status"
      click "Initialize"

- If the routine to collect pedestal drift data is running you will
  need to stop it before collecting the L2_Acpt Bump data.  Do this
  by:
      from the L1Cal GUI main menu, pick "Control/Status"
      click "Stop Tracking"
  We probably have enough of the pedestal drift tracking data for
  now, and you will not need to bother restarting this service
  afterwards.

M121 Fan and Fuse

The pair of blowers in M121 was not running.  This happened once
before on 5-Oct-2006.  I assume that the couple of recent crashes in
0x10 SBC were caused by it being too hot.  As in October 2006, the
problem was that the 20 Amp fuse in the contactor box for M101 was
blown.  In the contactor box, only the fuse for the phase that runs
the fans was open.  I replaced it.  I also disconnected the front fan
of the two fans (blowers) in this rack.  Running just one of these
600 cfm blowers is more than enough air circulation for the equipment
that is now running in M101.  On Thursday (i.e. after it had been
running for many hours) I felt the motor on the rear blower and it
was not hot at all.
In case that we need to ask about this on the phone, I put a yellow
sign, "Blower Motor LED -->" next to the pilot light on the blower
tray that indicates that the blowers have power.

I guess that this could be a problem with one of the motor run
capacitors.  I should replace them during the shutdown.  They are:

     GE 97F9001  7.5 uFd  370 VAC  Protected P921  A10000AFC
     5501GA29  Dielektrol VI  0452-06

We have not seen this problem so far in M122, M123, M124.  A blower
motor did fail in M124 in January 2003.

Swapped computer and Bit-3 at PAB.  Modulo needing a privileged
account DAQ-96 is running fine there.  I can no longer run it because
I do not have the pw.  There were 94 runs on the old machine - they
wanted this data - and it is not 100% clear to me that they actually
got it transferred to their new Fermi computer - so we should hold
onto this data for a little while.  I'm bringing this MSU box and its
Bit-3 back to MSU for a checkout and then it will come to D-Zero as a
known good spare.

Walter has nice ideas for readout cable/connectors and feedthrough.
How to break this out into groups of 16 and 32 channels ?  Actual
good low noise feedthrough pcb layout ?

------------------------------------------------------------------------------

DATE: 29-MAY:1-JUNE-2007     At: Fermi

TOPICS: Dean SCLD card installed

Pull from the L1 Cal Trig Communications Crate SCLD card SN#1 which
has the SCLD_T5 firmware on it.  Install SCLD card SN#2 which has the
SCLD_T5_Dean firmware on it.  The "External Trigger Delay DIP Switch"
is set with key #4 Open and all other keys Closed.  This should give
16.9 usec of delay to the L2_Period signal before it makes the
Save_Monitor_Data signal.

After swapping the SCLD cards and Initializing the L1 Cal Trig the
once every 5 seconds monitoring showed, "TAB 6 Chip 1 Status 8004".
I paged and then called Mike's cell number.  He translated this
status word to a problem in the link from ADF at eta: -16:-13
phi: 29:32 to TAB module 6, Chip 1, channel A.
I tried all of the "normal" things, i.e. moving around the 3 ADF
outputs, changing cables, re-plugging connectors.  Nothing worked.
Whatever TAB was plugged into the top output on this ADF-2 card ("D"
crate slot #5) would get a bad signal and show error status.  Panic
as the Store is going in.  Mike suggests changing ATC cards.  That
appears to fix the problem.

Careful examination of the ATC card that was pulled out shows that
the P0 connector that goes into the backplane was never pressed into
the ATC card correctly.  Put a tag on this ATC card that was pulled
out so that it will not be used again.  It had been labeled, "Spare
Tested #2".

Return to MSU the "white" pc (MSUL1B) that was brought down here to
JTAG program the SCLD card with SCLD_T5_Dean firmware.  SCLD card
SN#1 still has the standard SCLD_T5 firmware in it.  Also return to
MSU the VME end of the pci to VME interface (the pci end of this pair
is in the "white" computer).

At PAB we see the planes become more transparent to electrons from
the UV light pulser as Bias Voltage is added to the A and C planes.

------------------------------------------------------------------------------

DATE: 23:25-MAY-2007     At: Fermi

TOPICS: Dean SCLD

Program the eeprom in the spare SCLD, which is SCLD SN#2, with the
new Dean SCLD FPGA firmware.

At PAB we see electrons from the UV light pulser through vacuum and
through 1/2 atmosphere of Argon.

------------------------------------------------------------------------------

DATE: 15:17-MAY-2007     At: Fermi

TOPICS: Made cables for the new Luminosity electronics,
        general inspection

Made cables to connect the outputs from the new Luminosity L0
electronics to the TFW And-Or Term inputs and the Foreign Per Bunch
Scaler inputs.  Rich knows how to switch back and forth between using
his old and new electronics to send data to the TFW.

All fans and such look OK.  The back door was off of M113 and I put
it back on.
Nowadays M113 just has the VESDA head and the relay box that lets the
VESDA trip off the TFW and M101 if it sees smoke.

No changes made to the M122 Top crate.  A "review" of the M122 Top
crate hardware is in the 3:4-APR-2007 log book entry.

On Thursday I found that the back doors of all 3 TFW racks had been
opened and not fully closed back up.

Installed the rest of the DAQ-96 channels at PAB.  It looks OK.

------------------------------------------------------------------------------

DATE: 11:13-APR-2007     At: Fermi

TOPICS: TRICS to reset M122 Top, DAQ-96 delivery,
        check Bit-3 inventory

Wednesday - Restart TRICS (start the "auto-start" version) so that it
picks up the version of TRICS with the recovery code for M122 Top
hangs and the version that has the correct 11.2 version number icon
on the TRICS menu.

Deliver the DAQ-96 equipment to PAB.

In the yellow storage rack there is a 617 copper link.  There is a 9U
fiber optic link setup on the sidewalk - both ends.  There are 2 VME
halves of fiber optic links in the running L1 Cal Trig Communications
Crate.  There is no Bit-3 stuff in the windows machine in the L1 Cal
Trig racks.

------------------------------------------------------------------------------

DATE: 10-APR-2007     At: MSU

TOPICS: New version of TRICS

Start version 11.2.A of TRICS running.  This is exactly the same as
before but this new version has code to test and see if crate M122
Top is hung when TRICS is making its monitor data VME reads.  If
TRICS finds M122 Top hung then it causes a VME SysReset* to be
asserted in M122 Top and then it tries again to read the monitor
data.  It makes just one attempt to clear the M122 Top VME hang and
then moves on (i.e. TRICS can not get itself hung here).
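The recovery behavior described for TRICS V11.2.A (detect the hang, assert
SysReset*, retry the read once, then move on) can be sketched as follows.
This is NOT the TRICS code, which is not reproduced in this log book.  The
names CrateHung, read_block and sys_reset are hypothetical stand-ins for
whatever TRICS actually uses to read monitor data and to assert VME
SysReset* in M122 Top.

```python
class CrateHung(Exception):
    """Hypothetical error raised when a monitor data read finds the crate hung."""


def read_monitor_data_with_recovery(read_block, sys_reset):
    """Read monitor data, making at most one attempt to clear a hang.

    read_block: callable that returns the data or raises CrateHung.
    sys_reset:  callable that asserts VME SysReset* in the crate.
    Returns the data, or None if the crate is still hung after one reset,
    so that the caller can move on and never loops here.
    """
    try:
        return read_block()
    except CrateHung:
        sys_reset()                 # the single recovery attempt
    try:
        return read_block()         # retry the read exactly once
    except CrateHung:
        return None                 # give up on this sweep and move on


if __name__ == "__main__":
    state = {"hung": True, "resets": 0}

    def fake_read():
        if state["hung"]:
            raise CrateHung()
        return [0] * 4              # stand-in for per bunch scaler data

    def fake_reset():
        state["resets"] += 1
        state["hung"] = False

    print(read_monitor_data_with_recovery(fake_read, fake_reset), state)
```

Bounding the recovery to one reset and one retry is what keeps the monitor
sweep from getting stuck when the reset does not clear the crate.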
------------------------------------------------------------------------------

DATE: 3:4-APR-2007 At: Fermi

TOPICS: Work on the M122 Top Crate VME Monitor Data Readout for the Foreign Per Bunch Scalers, Deliver NEMA and Cu Boxes and Fuseblock to PAB

The M122 Top Crate had another "VME hang" on Monday morning 2-Apr-07 at about 10AM. It was discovered when they did the TFW Init during shot setup. They paged, and before asking them to push the button Philippe looked at the system from MSU to verify that he could tell from doing write/read that it was hung, and he tried to get the VI to issue the VME Sys Reset. Yes, he could tell via write/read that it was hung but we still could not get the Bit-3 to VI communications to issue the Sys Reset.

Philippe dug through the TFW TCC log files and looked at the 3 hangs of M122 Top Crate in February and March of 2007. For each of these VME hangs there was an associated Bit-3 Bus Timeout. These were spotted at

  %%18-Feb-2007 02:35:42.268
  %%11-Mar-2007 08:45:45.804
  %%12-Mar-2007 10:24:27.846

There is still the big question: is the bus time out caused when the hang first happens, or is the hang caused by whatever is causing the bus time out ?

Review of things we know about the M122 Top Crate VME Hangs:

This problem is associated with a couple of slots but not with what card is in the slot.

Once "hung", the card is left asserting its data onto the bus and thinking that it is still in a cycle (i.e. its VME LED is ON --> the state engine thinks it is in a cycle), but it is not asserting its DTACK* signal.

VME Sys Reset clears the hung card.

We think that the problem starts out with the card not completing a VME cycle, i.e. the cycle is never DTACKed and thus times out.

It appears that giving more setup time for the address lines and such, before the falling edge of Adrs Strobe starts the cycle, makes the problem go away for a while (months).
Another funny VME bus related thing that we see happens during FPGA Configuration: At a few places in the system, while TCC is Configuring FPGAs in slot X, the VME activity LED will flash in slot Y.

We also had trouble with the VME bus in the Routing Master: We had to slow down (or space out) the VME cycles that are initiated by the very fast RM SBC. If we did not slow them down then after a few hundred VME cycles the bus would "hang". I need to look back in the log book to recall the details about how it would hang. It was slowed down by delaying the DTACK* signal (which goes from THE Card back to the RM SBC) by 200 nsec on the TOM card.

In the early days, when the Master Clock would sometimes be turned off, and thus the 53 MHz clock to the VME IF chips went away, we also had trouble. If VME cycles were tried (when there was no 53 MHz clock) it would hang the VME bus in that card. This would happen because the monitor I/O was still running while the Master Clock was off. Once stuck like this, the standard way to get running again was to turn off and then back on the DC power. This whole issue was never studied very much. I do not know if we tried to get "un-stuck" by issuing a VME System Reset.

Tuesday I looked at the Timing and Control signals on the backplanes of M122 Top and Middle. I now think that I screwed up when I looked at them two weeks ago. I bet that I had the scope set for bandwidth limited. The 53 MHz on the Top backplane does not look that good. It is big enough but is pretty rough looking at some slots. M122 Top is worse than Mid.

I pulled off the RAY terminator and then things looked very rough. There is so much reflection that there is almost no 53 MHz near the slot #1 end of the backplane. With the RAY pulled off the monitor readout looked very sick. The VME LEDs for some slots near the slot #1 end were not flashing at all. The cards with flashing VME LEDs did not look right (the duration of the flash looked too long).
After a while, one card (slot #7 or #8) had its VME LED stuck ON.

I installed the RAY card with the 35 Ohm terminator for P1_TS(0) that I made on the last trip to Fermi. I did not install that 35 Ohm Terminator previously because I had thought that everything looked OK on the 53 MHz bus lines. The 53 MHz signal does look clearly different and better with the 35 Ohm terminator. Even lower Zo, i.e. 30 Ohm or 25 Ohm, might be better but this cuts into the Voltage swing (because the pull down current comes from 56 Ohm resistors) so I will leave it at 35 Ohm for now.

Question: which timing and control signals are actually used in the M122 Top Crate, versus which timing and control signals are sent to the M122 Top Crate ?

Timing and Control Signals Delivered to the M122 Top Crate Backplane

  Timing & Control    Carmen Master                       Used by
  Signal              Clock Signal                        Foreign PBS
  ----------------    --------------------------------    -----------
  P1_TS(0)            PCLK  53 MHz                        Y (VME)
  P1_TS(1)            TL0   Tick Clock                    Y
  P1_TS(2)            TL1   TRM Clock                     N
  P1_TS(3)            TL2   Beginning of Turn             Y
  P1_TS(4)            TL3   Interaction Marker            N
  P1_TS(5)            TL4   Cosmic Gap Marker             N
  P1_TS(6)            TL5   Sync Gap Marker               N
  P1_TS(7)            TL6   Spare Marker                  N
  P1_TS(10)           Capture Monitor Data from Helper    Y
  P1_TS(15)           Reset Scalers from the Helper       Y
  P1_TS(8,9,
    11,12,13,14)      No Signal                           N

With this many spare drivers on the TOM we could put a double driver on the P1_TS(0) 53 MHz line and then bring the RAY terminator down to 25 Ohms or whatever is required to fully clean up this signal on the backplane. Another approach is to just install 25 Ohm pull downs on the TOM and let the 100316s put out a lot of current. They are rated to swing a 25 Ohm bus.

M122 Top ran for 24 hours with the 35 Ohm terminator on P1_TS(0) with no problems, then just as I was getting ready to leave Wednesday afternoon I found it stuck. Slot #11 card SN #31 (as usual) was stuck.

I talked with Philippe about trying to issue the VME Sys Reset from here.
I gave that up because I do not understand the VI documentation and I did not make notes about how this worked from 68k land. I will try to make this work at MSU first.

Instead I played with the termination on the AS* VME line. I put a 150 Ohm resistor in series with a 112 pFd capacitor from the AS* line to GND. The 112 pFd cap is really two 56 pFd caps. They go to the two GND pins adjacent to AS*. AS* is 22C. 22B and 22D are GNDs. I put this on the back of the backplane at slot #1. I would have rather put it on slot #21 but that is occupied with the RAY terminator. In any case this should make some change to the AS* waveform. This should whack away at things faster than 150 Ohm x 112 pFd = 16.8 nsec and not affect the DC values.

------------------------------------------------------------------------------

DATE: 13:15-MAR-2007 At: Fermi

TOPICS: Work on the M122 Top Crate VME Monitor Data Readout for the Foreign Per Bunch Scalers

Tuesday, Work on the M122 Top Crate VME Monitor Data Readout for the Foreign Per Bunch Scalers

Check the power supply Voltages right at the M122 Top Crate Backplane

  +5.050V   +3.332V   -2.004V   -4.528V

Before touching anything record some scope waveforms:

  11:56  Ch1 is AS*     Ch2 is Adrs_23                          at Slot #11
  12:00  Ch1 is AS*     Ch2 is Adrs_21                          at Slot #11
  12:02  Ch1 is AS*     Ch2 is DTACK*                           at Slot #11
  12:18  Ch1 is pin A1  Ch2 is pin A2 i.e. TS0   Math is diff   at Slot #11
  12:21  Ch1 is pin A1  Ch2 is pin A2 i.e. TS0   Math is diff   at Slot #20

These scope pictures are on scope disk #4.

The address line has about 185 nsec of setup before the edge of AS* that says that a cycle is beginning.

The Timing Signal #0 on pins A1 and A2 looks like a good sine waveform. The peak to peak amplitude of each side is about 105-120 mV at Slot #11 and about 110-125 mV at Slot #20. The full differential voltage is about 220 mVpp at Slot #11 and about 240 mVpp at Slot #20.
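As a quick cross-check of the amplitudes above: for an ideal complementary pair, the differential peak to peak voltage is the sum of the two single-ended peak to peak voltages. The short Python sketch below assumes that both sides have the same amplitude range and are exactly out of phase, which the scope pictures only approximately support. The function name is made up for this illustration.

```python
def differential_range_mvpp(side_min_mv, side_max_mv):
    """Differential p-p range for two equal, exactly complementary sides."""
    return 2 * side_min_mv, 2 * side_max_mv


# Single-ended amplitudes read from the scope (mV peak to peak).
slot11 = differential_range_mvpp(105, 120)  # (210, 240)
slot20 = differential_range_mvpp(110, 125)  # (220, 250)

# The measured differential values fall inside the expected ranges.
assert slot11[0] <= 220 <= slot11[1]
assert slot20[0] <= 240 <= slot20[1]
```

Under these assumptions the measured 220 mVpp and 240 mVpp are consistent with the single-ended readings at both slots.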
Also before pulling anything apart I tried a couple rounds of Configuring M122 Top to see if any extra VME LEDs were flashing during the Configuration process. There is one clear overlap: while Configuring slot 18 the card in slot 21 also flashes very solidly and continuously. This matches what I saw last August when I added the first 100 nsec of AS* delay to the M122 Top. See the Log Book entry for 10:12-AUG-2006.

OK now pull out the VI Slave with its TOM card. I had the DAQ Shifter get settled down in a ZB run before doing this, warned him that once I started work on this TFW crate he should not try to change triggers, and I stopped the Monitor Pool Filler. This worked OK - i.e. ZB ran OK while I worked on the TOM from M122 Top.

I added a second 100 nsec of delay to the AS* signal (both edges). See the drawings in the front of the TFW FPGA notebook (Printronix blue book). For this delay line I picked up Gnd from C56 and C102 and I picked up Vcc from C50.

After adding this 2nd delay I took two more scope pictures:

  15:39  Ch1 is AS*  Ch2 is Adrs_21  at Slot #11
  15:43  Ch1 is AS*  Ch2 is DTACK*   at Slot #11

Things changed as expected - there is now 285 nsec of setup before AS* falls to its active Low Voltage state.

So now M122 Top has 200 nsec of delay on the AS* coming out of the VI and the Routing Master has 200 nsec of delay on the DTACK* going into the RM's SBC.

Doing a Configuration I still see slot 21 flash while slot 18 is being Configured (but it may not be flashing as much).

The thing that is different about M122 Top Crate is that it has all 21 slots loaded with cards. None of the other TFW Crates are fully loaded. So in M122 Top both the VME Bus and the upper Timing Signal Bus are more heavily loaded than in other crates. I thought that this might cause trouble for the 53 MHz, which is on TS_0, because the Zo of this bus is lower, and thus it might benefit from a lower resistance terminator.
Thus I have made a RAY terminator card that has TS_0 terminated with 35 Ohms instead of 39 Ohms like all the rest of the Timing Signal Bus lines are terminated. I will keep this modified RAY at D_Zero but I did NOT install it on this trip.

Recall the recent history of M122 Top Crate:

  11-AUG-2006          "Fixed" the problem of VME hangs by adding 100 nsec
                       additional setup time before the AS* signal becomes
                       active.
  9-DEC-2006           First VME hang in 4 months
  18-FEB-2007          Another VME hang in M122 Top crate
  11 and 12 MAR-2007   Hangs on each of these days --> get in the car

Another good entry in the log book to look at is 23:27-MAY-2006. This describes what the M122 Top Crate looks like when it is hung.

  " - The VME LED on the Scaler Card in slot #11 is always ON.
    - The AS* and DTACK* LEDs on the TOM are both flashing. "

So the stuck card is not stuck in the sense that it has left its DTACK* asserted. Rather it is only stuck in the sense that it has its Data Bus drivers enabled.

The only loops in the VHDL are:

  - waiting (without Board_Select asserted) for a cycle to begin and
  - waiting (without the Data Bus Drivers enabled) for DS* to go low and
  - waiting (with DTACK* asserted) for the cycle to end.

From the symptoms the hang can not be in any of these loops.

Could there be a problem with the "Error Exit" part of the VME state engine ?

When the card is hung, a VME Sys_Reset does clear the problem and VME Sys_Reset only touches the VME state engine and the Interrupt Controller.

Could the problem be in the Interrupt Controller ??

  NO   because the Interrupt Controller can not force Board_Select to be asserted
  YES  because the Interrupt Controller can force the Data Bus Drivers to be enabled.

Talked with Dean about collecting the BLS signals at L2_Acpt time that he wants to see and about filtering in the ADF-2s.

Checked all the fans and racks and such and things look OK.
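The argument earlier in this entry, that the hang can not be in any of the three VHDL wait loops, can be checked mechanically with a small Python sketch. The signal names and dictionary layout are made up for this illustration, and treating "VME LED ON" as meaning Board_Select is asserted is an assumption; the real state engine has not been consulted.

```python
# Signal levels each wait loop requires, as listed in the entry above.
WAIT_LOOPS = {
    "wait for cycle to begin": {"board_select": False},
    "wait for DS* to go low": {"data_drivers_enabled": False},
    "wait for cycle to end": {"dtack_asserted": True},
}

# Observed hung state. Treating "VME LED ON" as Board_Select asserted
# is an assumption made for this sketch.
OBSERVED_HANG = {
    "board_select": True,
    "data_drivers_enabled": True,
    "dtack_asserted": False,
}


def compatible_loops(observed, loops=WAIT_LOOPS):
    """Return the names of wait loops whose requirements match the observation."""
    return [
        name
        for name, required in loops.items()
        if all(observed[signal] == level for signal, level in required.items())
    ]


print(compatible_loops(OBSERVED_HANG))  # prints []
```

Each loop is ruled out by exactly one observed signal, which is why the suspicion moves on to the error exit path and the Interrupt Controller.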
------------------------------------------------------------------------------

DATE: 12-MAR-2007 At: MSU

TOPICS: Foreign Per Bunch Scaler M122 Top Crate VME Monitor Data Readout Hang

There were two more failures of the M122 Top crate Foreign Per Bunch Scaler readout over the weekend.

Sunday early AM clips from the control room log book

Sunday, March 11, 2007 0:44:28 CST: CAPTAIN/Log: 603888: Nirmalya Parua
Channel 13 is showing D0 lumi at 0, Acnet is looking fine.

Sunday, March 11, 2007 0:57:53 CST: DAQ/Log: 603892: Emanuel Strauss
Luminosity in the Captain's Monitor stopped refreshing, but not in ACNET. Paged the lumi expert, eventually we had to reset one of the crates in rack 122. Card 31 was the source of the problem.
**Comment by estrauss on Sunday, March 11, 2007 12:58:30 AM CST
We ended up losing ~8 minutes of data taking

Sunday, March 11, 2007 1:05:07 CST: LUM/Log: 603893: Yuji Enari
TCC had a problem from LBN=5306618 to LBN=5306639. There is no luminosity info during this period. This problem has been fixed by resetting crate in M122 by DAQ shifter.

Monday early AM note from captain Abid Patwa

"At around 12:45 am (March 12th), while we were running in store 5274 and lum ~ 85e30, D0 luminosity dropped to around 0E30. I contacted MCR and they told me that one RF station at F0 tripped and D0 should investigate our luminosity counters. After contacting the on-call Lum Expert and reading the March 11, 12:57 am DAQ shifter logbook entry, the same situation appeared to happen last night also. The solution was to reset Rack 122, Card 31 (lum scalar rack). DAQ shifter did this and the luminosity was back up to the normal 80e30 reading (consistent with lum at CDF)."

"I contacted MCR and they noted that beam is stable and that RF station will remain off (their normal procedure). I also contacted Bill to explain the situation. In any case, the culprit looks to be something going on with Rack 122, Card 31. Would you be able to look at this?
The problem appears to have happened two consecutive nights..."

------------------------------------------------------------------------------

DATE: 7,8-MAR-2007 At: Fermi

TOPICS: Collaboration Meeting, Check over MCH-1, Get NEMA and Cu Boxes from PAB, L2 Global Sync Errors

Checked all the visible fans and such in MCH-1. All looks OK. Did not open the racks to check the rear blowers, the transitions area fans on SCL Hub or TFW Readout, or the TFW TCC VME Crate Fan. Need to check those on a trip when I can open things up between Stores.

L2 Global has been throwing some errors where it sees an L2_Decision come from the TFW that does not match what it expects to see based on the L2 Answers that it sent to the TFW. When this happens the DAQ system hangs. Typically DAQAI sees this and figures out the problem and requests an SCL Init and then things run OK again. Typically these errors come in bursts of 3 to 5 of them within a minute or two. Typically this happens once or twice during a 24 hour Store.

I got to watch one of these bursts. I watched the SCL Hub-End to see if it had flashing LEDs and I watched the TFW readout VRBs to see if any of them were having G-Link sync problems. I saw no indication of a problem at that level.

The L2 Global cpu does write something into its log file when this happens but from what I can understand the information that it writes does NOT include the mask of L1 Specific Trigs fired that it received from the TFW or the mask of L2 Answers that it sent back to the TFW.

I did check which fibers are split in the TFW HSRO. The Tick_&_Turn Scaler M123 Bottom Slot 21 and the L1_Mask FOM++ M123 Middle Slot 16 are the two cards that have their HSRO fiber split and driving both their VRB and the L2 Global Crate FIC.
------------------------------------------------------------------------------

DATE: 18-FEB-2007 At: MSU action at Fermi

TOPICS: M122 Top Crate VME Hang

Sunday morning Feb 18th at about 4AM the Control Room paged because there were error messages when they tried to Initialize the TFW before the Store that was just then about to go in. I asked them to check M122 Top crate and they saw a VME LED stuck ON and so they pushed the VME Reset button on the slot #1 card in that crate. The TFW then Initialized without errors.

We had not been in a Store for about the past 16 hours so I assume that the VME readout of the Per Bunch Scalers hung at some point during this time when there was no Store.

The previous M122 Top Crate VME hang was on 9-DEC-2006.

------------------------------------------------------------------------------

DATE: 9-FEB-2007 At: MSU

TOPICS: MSU Wiener Crate Inventory

The following is the inventory information about the Wiener Crates that are currently at MSU.

  Production Test Crate
    Crate                  UEV6023    SN# 2397015
    Power Supply Chassis   UEP6021    SN# 2497072
    Fan Tray               UEL6020A   SN# 0697029

  Saclay Crate
    Crate                  UEV6023    SN# 4198018
    Power Supply Chassis   UEP6021    SN# 3998075
    Fan Tray               UEL6020    SN# 4098012

The Saclay Crate came back from Saclay in the summer of 2006. Its power supply was sent to Wiener and modified so that it is now setup like all of the other L1 Cal Trig Wiener power supplies, i.e. +5V and +3.3V digital and +5V and -5V analog.

See the log book entry from 1:3-FEB-2006 for the inventory list of the Wiener equipment in the running system in MCH-1.

The spare ADF Crate Power Supply Chassis stored in the spares cabinet at D-Zero is Wiener model number UEP6021 serial number SN# 5196006.

The spare ADF Crate Fan Tray stored in the spares cabinet at D-Zero is Wiener model number UEL6020A serial number SN# 2296001.
------------------------------------------------------------------------------

DATE: 11,12-JAN-2007 At: Fermi

TOPICS: Install SCL Status Cable for Geo Section $7D, Replace ATC card in ADF Crate

Installed an SCL Status Cable for Geographic Section $7D, i.e. the L1 Cal Trig TAB-GAB VME-SCL Geographic Section.

Friday about 8:30 AM the L1 Cal Trig again had Parity errors (see the 7-Jan-2007 log book entry). The problem was associated with the LVDS link from ADF Crate D Slot 20, i.e. it stayed with the crate slot and not with the ADF-2 card which had been swapped between slots 20 and 21 in that crate on Jan 7th. As seen on the L1 Cal Trig Monitor Display Gui this is called Parity error for bit 2 on TAB Board 5.

So Selcuk and Mike decided to change the ATC card. They pulled from ADF Crate D Slot 20 ATC card SN#74 and installed ATC card SN# Spare6.

------------------------------------------------------------------------------

DATE: 7-JAN-2007 At: MSU action at Fermi

TOPICS: ADF-2 to TAB Parity Error

On and off for the past couple of days there had been some trouble with a TAB card showing a Parity error on the LVDS link from the ADF-2 card in ADF Crate "D" slot 20. People had tried a different cable and they thought that fixed the problem but then the Parity error came back late Sunday afternoon or evening.

Philippe had the pager and he had them check things over carefully before letting people get too excited about changing the ADF-2 card. They un-plugged a set of LVDS cables, I think swapped all 6 cables from slot 20 and 21 in ADF Crate D, then changed their minds and put back all of the LVDS cables the way that they originally were. At that point everything was working OK again.

Norm was Captain and he had one other good idea, i.e. swap the ADF-2 cards in slots 20 and 21 in ADF Crate D, and that way we may end up learning something from all of this. So with Philippe's help they swapped ADF-2 Cards in Crate D between slots 20 and 21.
It had been:

  slot 20   ADF-2 card SN# D31
  slot 21   ADF-2 card SN# D32

Now it is:

  slot 20   ADF-2 card SN# D32
  slot 21   ADF-2 card SN# D31

After swapping ADF-2 cards between slots 20 and 21, everything was running OK for the next 4 or 5 days.

The folks at Fermi also had trouble finding the spare components for the ADF System. A note describing the ADF System spares was presented at the L1 Cal Trig video meeting on 9th Jan 2007 and put in the Card and Crates section of the ADF System documentation on the web.