D-Zero Hall L1 Framework and L1 Calorimeter Trigger Logbook
     ---------------------------------------------------------------
               Log book for 1993 is in D0_HALL_LOGBOOK.LBK_1993


Date:                  At:        Topics:

..............................................................................

Date: 30-DEC-1994     At: MSU           TCC Problem

    - Jan (just back from vacation) called.
      They were trying a new COOR (which was unrelated to this problem, but
      caused a request for an initialize). After the initialize,
      TRGMON stayed stale, with no sptrg #31 and no lights on the L1FW.

      This is identical to 25-MAY-1994, 12-JUL-1994 and probably 13-JUN-1994.

    Trying to do a directory on TCC produces the error message:
%DIRECT-E-OPENIN, error opening D0HTCC::DUA0:[TRIGGER]TRICS_*.LOG; as input
-RMS-F-NET, network operation failed at remote node; DAP code = 01F77C54

    TRICS had a problem writing to its logfile at some point, and switched
    mode to fly with NO logfile. But TCC still tries to get its input
    files (init_auxi, reset,...) from the disk.

    There was no particular rush today, so Philippe tried to see if he could
    find a possible "emergency recovery" action (in case this happens in the
    middle of a run, and we don't want to lose scaler information).
    So Ph. changed the variable holding the location of TRICS's command files.
    
    Philippe then told TCC to initialize, and it did well with the L1FW (lights
    flashed, and sptrg #31 appeared) but the initialization got in trouble when
    it reached the L1.5CT and the load from_local_disk command. The
    initialization seemed to hang, but was just slow, as it had to time out on
    each of the 12 EXE file OPENs.

    Reboot TCC and everything looks normal. There was an access after that.

    The disk loss probably occurred around 6 am (last entry in MPOOL_SERVER.LOG).
    TCC didn't try reaching its disk (after giving up on writing logfiles) and
    there were no problems or symptoms for COOR, as long as no request for
    initialization, or begin/end run file was sent.

    For some of the earlier entries, we heard that there had been other network
    problems at DZero, and Jan had to reboot a bunch of L2 nodes around the same
    time.  But this time it didn't appear so; at least there were no entries in
    the logbooks...
    
..............................................................................

Date: 23-24-DEC-1994  At: DZero Topics: Deliver the cost estimates and
                                descriptions of the Run II equipment to Jim and
                                have a meeting with him about all of this,  run
                                some Cal Trig Random Cell tests.

Deliver the Dec94 upgrade description and cost estimate to Jim Christenson
and talk with him about it for 15 or 20 minutes.  Mail all 5 files to
Mike Tuts at Mike's request.

Between Stores, Run some tests of the L1 Cal Trigger.

Setting up with eta 1:20 and all options except  do not check the
Framework terms;  then twice start seeing errors after about 4k loops
of CalTrig_Random.   All of the errors are Px Py off by 4096.  Note
that the system was not locked on LU page 4 like it was last week.
If the same loop was asked for again, the same error always occurred again.

S-HTT/PAR%rand% Loop 1000/900000, Error Count is 0
S-HTT/PAR%rand% Loop 2000/900000, Error Count is 0
S-HTT/PAR%rand% Loop 3000/900000, Error Count is 0
S-HTT/PAR%rand% Loop 4000/900000, Error Count is 0
                                       %% time: 23-DEC-1994 14:47:49.10

    Global Py Momentum Sum is -490 instead of -4586, T1 trunc = 4096
    Pick was 218,HD,POS,E_20,P_5,LUP_8-2-7-7,EMET_REF,REF_0,244,CMP_3
    Loop 4329/900000, Error Count 1. Continue? Same Loop? ReSynch?

    Global Py Momentum Sum is -589 instead of -4685, T1 trunc = 4096
    Pick was 195,HD,POS,E_11,P_28,LUP_2-8-7-7,TOTET_REF,REF_3,128,CMP_3
    Loop 4356/900000, Error Count 2. Continue? Same Loop? ReSynch?

    Global Py Momentum Sum is -999 instead of 3097, T1 trunc = 0
    Pick was 86,HD,NEG,E_13,P_13,LUP_7-8-7-7,TOTET_REF,REF_1,78,CMP_2
    Loop 4427/900000, Error Count 3. Continue? Same Loop? ReSynch?

Start over and try again.
S-HTT/PAR%rand% Loop 1000/900000, Error Count is 0
S-HTT/PAR%rand% Loop 2000/900000, Error Count is 0
S-HTT/PAR%rand% Loop 3000/900000, Error Count is 0
S-HTT/PAR%rand% Loop 4000/900000, Error Count is 0
                                       %% time: 23-DEC-1994 14:51:59.89

    Global Px Momentum Sum is -353 instead of -4449, T1 trunc = 4096
    Pick was 240,EM,NEG,E_17,P_8,LUP_7-3-7-7,EMET_REF,REF_2,112,CMP_3
    Loop 4231/900000, Error Count 1. Continue? Same Loop? ReSynch?

    Global Py Momentum Sum is -243 instead of -4339, T1 trunc = 4096
    Pick was 67,HD,NEG,E_8,P_3,LUP_5-4-7-7,HDET_VETO,REF_0,240,CMP_1
    Loop 4444/900000, Error Count 2. Continue? Same Loop? ReSynch?

    Global Py Momentum Sum is -67 instead of 4029, T1 trunc = 0
    Pick was 169,EM,POS,E_1,P_22,LUP_8-5-7-7,HDET_VETO,REF_0,36,CMP_3
    Loop 4533/900000, Error Count 3. Continue? Same Loop? ReSynch?

    Global Px Momentum Sum is 544 instead of 4640, T1 trunc = 0
    Pick was 188,HD,POS,E_19,P_31,LUP_6-2-7-7,EMET_REF,REF_0,192,CMP_3
    Loop 4582/900000, Error Count 4. Continue? Same Loop? ReSynch?

    Global Px Momentum Sum is 2027 instead of -2069, T1 trunc = 4096
    Global Py Momentum Sum is 619 instead of -3477, T1 trunc = 4096
    Pick was 201,HD,NEG,E_15,P_12,LUP_3-7-7-7,TOTET_REF,REF_2,191,CMP_1
    Loop 4880/900000, Error Count 5. Continue? Same Loop? ReSynch?

Note that these errors are:  "Global Px Momentum Sum"  and "Global Py Momentum
Sum" errors.   Last week we saw "Cell Px" errors.   What goes on at high eta
that is funny and can cause errors of 4096 when you allow the LU page to move
around??
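
A quick cross-check of the "off by 4096" statement, as a minimal Python sketch.
The nine error lines are copied from the two runs above.  Reading "is X instead
of Y" as observed X versus expected Y is an assumption, and the names LOG,
PATTERN and sum_errors are made up for this illustration.

```python
import re

# "Global Px/Py Momentum Sum" error lines copied from the two runs above.
LOG = """\
Global Py Momentum Sum is -490 instead of -4586, T1 trunc = 4096
Global Py Momentum Sum is -589 instead of -4685, T1 trunc = 4096
Global Py Momentum Sum is -999 instead of 3097, T1 trunc = 0
Global Px Momentum Sum is -353 instead of -4449, T1 trunc = 4096
Global Py Momentum Sum is -243 instead of -4339, T1 trunc = 4096
Global Py Momentum Sum is -67 instead of 4029, T1 trunc = 0
Global Px Momentum Sum is 544 instead of 4640, T1 trunc = 0
Global Px Momentum Sum is 2027 instead of -2069, T1 trunc = 4096
Global Py Momentum Sum is 619 instead of -3477, T1 trunc = 4096
"""

PATTERN = re.compile(
    r"Global (P[xy]) Momentum Sum is (-?\d+) instead of (-?\d+), "
    r"T1 trunc = (\d+)"
)

def sum_errors(text):
    """Return (axis, observed, expected, observed - expected, trunc) tuples."""
    rows = []
    for m in PATTERN.finditer(text):
        axis = m.group(1)
        observed, expected, trunc = (int(m.group(i)) for i in (2, 3, 4))
        rows.append((axis, observed, expected, observed - expected, trunc))
    return rows

errors = sum_errors(LOG)
for axis, observed, expected, delta, trunc in errors:
    print(f"{axis}: observed {observed:6d}  expected {expected:6d}  "
          f"delta {delta:+5d}  T1 trunc {trunc}")
```

Every delta has magnitude 4096, i.e. 2**12.  In these nine lines the delta is
+4096 whenever T1 trunc is 4096 and -4096 whenever it is 0; what that means in
the hardware is not established here.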

Give up on the above attack and try working in just eta 1:16.
Setting up with eta 1:16 and all options except do Not check the Framework
terms then we run 600k loops of CalTrig_Random with zero errors.

Now try locking on LU page #4.
Setting up with eta 1:20 and all options except: lock on LU page #4 and
do Not check the Framework terms;  then twice run 100k loops of
CalTrig_Random with zero errors.
    S-HTT/PAR%rand% Loop 100000/100000, Error Count is 0
    S-HTT/PAR%rand% Loop 100000/100000, Error Count is 0
..............................................................................

Date: 15-16-DEC-1994  At: DZero Topics: Fix CTFE ref supply at eta=-9:-12 phi=16
                                        Reset pedestals for this CTFE
                                        Run random test (errors in Px card)
                                        Install a timing marker cable in M102
                                        Reboot TCC, Atlas meetings, ECB meeting
                                        about central detectors, talk with Mike
                                        Matulik and Marvin Johnson.

    Investigate pedestal drift problem on CTFE card at eta=-9:-12 phi=16.
The problem is traced back to a ceramic bypass cap in the distribution network
for the -1 V ref supply to the ADC for the channel at eta = -12.  This card
SN#326 was fixed and put back into the system.  The bypass cap that shorted was
on U294.

    After returning to the old Init_DAC_Bytes.LSM file, the pedestals for this
card are now around 10 ADC counts. It must be that this bad capacitor had been
sick for a while.  Run find_dac on this card to find the correct new values and
update Init_DAC_Bytes.LSM.   Copy the new Init_DAC_Bytes.LSM file to MSU.
The "special"  Init_DAC_Bytes.LSM  file that had been used for one or two days
to compensate for the CTFE "drift" was deleted.

    Run 100,000 loops of random test on eta 1:16 and all phis, all pages, all
reference sets, including the large tile andor term test in the L1 FW.
 --> No error detected.

    Run loops of random test on the full eta 1:20 coverage, limited to lookup
page #4 (and without checking the andor terms). A series of errors appear, all
centered around the Tier #1 Px card in crate +17:20 ; 25:32.

E-HRD/TST%rand% Cell Px =2620, CAT2 Thrsh CMP_0 =2620 but comp bit#5:3 =101                         %% time: 15-DEC-1994 19:24:38.94
E-HRD/TST%rand% Pick was 209,EM,POS,E_18,P_31,LUP_4-4-4-4,TOTET_REF,REF_3,108,CMP_0                 %% time: 15-DEC-1994 19:24:39.05
P-HTT/PAR%rand% Loop 17193/100000, Error Count 1. Continue? Same Loop? ReSynch?                     %% time: 15-DEC-1994 19:24:39.15
(error not repeated by redoing same loop)

E-HRD/TST%rand% Cell Px =2681, CAT2 Thrsh CMP_0 =2681 but comp bit#5:3 =101                         %% time: 15-DEC-1994 19:28:17.62
E-HRD/TST%rand% Cell Px =2681, CAT2 Thrsh CMP_1 =2582 but comp bit#5:3 =101                         %% time: 15-DEC-1994 19:28:17.72
E-HRD/TST%rand% Cell Px =2681, CAT2 Thrsh CMP_2 =2587 but comp bit#5:3 =101                         %% time: 15-DEC-1994 19:28:17.83
E-HRD/TST%rand% Cell Px =2681, CAT2 Thrsh CMP_3 =2575 but comp bit#5:3 =101                         %% time: 15-DEC-1994 19:28:17.93
E-HRD/TST%rand% Pick was 238,EM,POS,E_18,P_32,LUP_4-4-4-4,EMET_REF,REF_3,8,CMP_0                    %% time: 15-DEC-1994 19:28:18.03
P-HTT/PAR%rand% Loop 18423/100000, Error Count 2. Continue? Same Loop? ReSynch?                     %% time: 15-DEC-1994 19:28:18.13
(error not repeated by redoing same loop)

E-HRD/TST%rand% Cell Px =2612, CAT2 Thrsh CMP_3 =2612 but comp bit#5:3 =101                         %% time: 15-DEC-1994 19:33:44.57
E-HRD/TST%rand% Pick was 39,EM,POS,E_19,P_32,LUP_4-4-4-4,EMET_REF,REF_1,31,CMP_3                    %% time: 15-DEC-1994 19:33:44.68
P-HTT/PAR%rand% Loop 19326/100000, Error Count 3. Continue? Same Loop? ReSynch?                     %% time: 15-DEC-1994 19:33:44.77
(error repeated 4 times by redoing same loop, then not repeated after 6 more)

E-HRD/TST%rand% Cell Px =2612, CAT2 Thrsh CMP_0 =2597 but comp bit#5:3 =101                         %% time: 15-DEC-1994 19:34:39.80
E-HRD/TST%rand% Cell Px =2612, CAT2 Thrsh CMP_1 =2612 but comp bit#5:3 =101                         %% time: 15-DEC-1994 19:34:39.90
E-HRD/TST%rand% Cell Px =2612, CAT2 Thrsh CMP_2 =2602 but comp bit#5:3 =101                         %% time: 15-DEC-1994 19:34:40.01
E-HRD/TST%rand% Cell Px =2612, CAT2 Thrsh CMP_3 =2587 but comp bit#5:3 =101                         %% time: 15-DEC-1994 19:34:40.11
E-HRD/TST%rand% Pick was 143,HD,POS,E_18,P_27,LUP_4-4-4-4,HDET_VETO,REF_3,182,CMP_1                 %% time: 15-DEC-1994 19:34:40.22
P-HTT/PAR%rand% Loop 20777/100000, Error Count 4. Continue? Same Loop? ReSynch?                     %% time: 15-DEC-1994 19:34:40.32
(we did not try to repeat this one)

In all these cases, the card claims that its sum is "smaller" than the comparator
threshold, when it should say "equal" or "greater".
The front LED matched what TCC read out (LEDs off).
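
The "should say equal or greater" statement can be checked against the logged
numbers with a minimal Python sketch.  It only compares the Cell Px value with
the threshold on each line; it does not try to decode the comp bit#5:3 pattern,
whose encoding is not given in this entry.  The names LOG, LINE and relation
are made up for this illustration.

```python
import re

# The ten "Cell Px" error lines from the run above, timestamps dropped.
LOG = """\
Cell Px =2620, CAT2 Thrsh CMP_0 =2620 but comp bit#5:3 =101
Cell Px =2681, CAT2 Thrsh CMP_0 =2681 but comp bit#5:3 =101
Cell Px =2681, CAT2 Thrsh CMP_1 =2582 but comp bit#5:3 =101
Cell Px =2681, CAT2 Thrsh CMP_2 =2587 but comp bit#5:3 =101
Cell Px =2681, CAT2 Thrsh CMP_3 =2575 but comp bit#5:3 =101
Cell Px =2612, CAT2 Thrsh CMP_3 =2612 but comp bit#5:3 =101
Cell Px =2612, CAT2 Thrsh CMP_0 =2597 but comp bit#5:3 =101
Cell Px =2612, CAT2 Thrsh CMP_1 =2612 but comp bit#5:3 =101
Cell Px =2612, CAT2 Thrsh CMP_2 =2602 but comp bit#5:3 =101
Cell Px =2612, CAT2 Thrsh CMP_3 =2587 but comp bit#5:3 =101
"""

LINE = re.compile(
    r"Cell Px =(\d+), CAT2 Thrsh (CMP_\d) =(\d+) but comp bit#5:3 =([01]+)"
)

def relation(px, threshold):
    """Relation of the cell sum to the threshold, from the logged integers."""
    if px == threshold:
        return "equal"
    return "greater" if px > threshold else "smaller"

rows = [
    (m.group(2), int(m.group(1)), int(m.group(3)), m.group(4))
    for m in LINE.finditer(LOG)
]
relations = [relation(px, thr) for _, px, thr, _ in rows]

for (cmp_name, px, thr, bits), rel in zip(rows, relations):
    print(f"{cmp_name}: Px {px} vs threshold {thr} -> {rel}, bits read {bits}")
```

None of the ten lines has the sum below the threshold, and all ten read back
the same bit pattern.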

Dan shoved on the card (which "didn't move"). But we had time to run another
20,000 loops and there were no more errors.

    Dan installed a timing marker cable for K.Johns. This is a LEMO cable
sticking out of the door of rack M102. This cable is plugged into a CTMBD for the
Start Digitization DIGIMEM backplane of this rack, and shows the IML Latch
Clock signal, which is MTG TSS #2, and CTMBD monitor signal #F.  We were able
to install this cable without turning power off.  We opened the front door of
M102 only about 5", with Steve carefully watching the air flow sensor, and
plugged the cable into the CTMBD.  Checked this signal against the BC T0 timing
reference from Carmen's Master Clock and all looked fine.

    TCC was rebooted before it was returned to the shifters for the next quiet
time.

    ATLAS meeting at Argonne with video link to UCI on the 15th.  Meeting on the
16th with a failed video link to CERN.

Talk with Mike Matulik about SLIM and give him the first written information
about SLIM.  Talk with Marvin about Mike working on SLIM and about using SAR
in our new L1 Run II equipment, and about FPGA's, and about VHDL of the Run
II equipment.
..............................................................................

Date:  14-DEC-1994   At: MSU    Topics: Edit the  Init_DAC_Bytes.LSM  file

Edit the  Init_DAC_Bytes.LSM  file to compensate for all channels on the CTFE
card  eta -9:-12  phi 16  having pedestals that have drifted down from 8 to 4.
The following changes were made:
        -9,16 EM  move DAC pedestal from  38  to  49.
        -9,16 HD  move DAC pedestal from  38  to  49.
       -10,16 EM  move DAC pedestal from  33  to  44.
       -10,16 HD  move DAC pedestal from  30  to  41.
       -11,16 EM  move DAC pedestal from  27  to  38.
       -11,16 HD  move DAC pedestal from  38  to  49.
       -12,16 EM  move DAC pedestal from  37  to  48.
       -12,16 HD  move DAC pedestal from  43  to  54. can't work with 54 so 49

In TrgCur:  at Fermi there are now two  Init_DAC_Bytes.LSM  files.
Ver 3 is the one that was in use up until tonight.  Ver 6 is the temporary
one that we will use until this CTFE is fixed.   Only the ver 6 file is
in D0HTCC::DUA0:[Trigger].   This temporary ver 6 file was NOT copied to MSU.

For the -12,16 HD TT  I wanted to use a DAC pedestal value of 54 but the TRICS
READ_LOAD  DAC Pedestal file function just said  "Bad_Failure" when I had this
big of a value in the pedestal file.   So I moved 54 to 49.

The  Compare_DAC_Byte.Exe  would also not run when the value 54 was in the
file.

The conversion slope of this CTFE looks OK,  i.e. when I load 255 in the
pedestal DAC then the ADC reads something like 78 or 80.  So for now I assume
that it is just a "drift" in the offset reference supply and that the +- 1 Volt
ADC supplies are still operating OK.
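
The edits above amount to adding 11 to each DAC value and backing off to 49
where the result was not accepted.  A minimal Python sketch of that bookkeeping
follows.  The cap of 49 is an assumption: the log only shows that 54 failed and
49 worked, so the real limit is not known.  The slope estimate assumes a linear
DAC-to-ADC response and an illustrative pairing of DAC value 38 with a reading
of 4 ADC counts; the log does not say which channel gave the 78 to 80 reading.

```python
# (eta, phi, section) -> old DAC pedestal value, copied from the list above.
OLD = {
    (-9, 16, "EM"): 38,  (-9, 16, "HD"): 38,
    (-10, 16, "EM"): 33, (-10, 16, "HD"): 30,
    (-11, 16, "EM"): 27, (-11, 16, "HD"): 38,
    (-12, 16, "EM"): 37, (-12, 16, "HD"): 43,
}

SHIFT = 11          # common shift applied in this entry
MAX_ACCEPTED = 49   # assumption: largest value seen to work, not a known limit

def shifted(old, shift=SHIFT, cap=MAX_ACCEPTED):
    """Apply the common shift, clamping at the largest value known to load."""
    return {key: min(value + shift, cap) for key, value in old.items()}

new = shifted(OLD)
for key in OLD:
    print(key, OLD[key], "->", new[key])

# Rough slope check, assuming linearity between two points:
#   DAC 38  -> about 4 ADC counts (illustrative drifted channel)
#   DAC 255 -> about 79 ADC counts ("something like 78 or 80")
slope = (79 - 4) / (255 - 38)        # ADC counts per DAC count
needed_shift = (8 - 4) / slope       # DAC counts to recover 4 ADC counts
print(f"slope {slope:.3f} ADC/DAC, shift needed about {needed_shift:.1f}")
```

Under these assumptions the estimated shift comes out between 11 and 12 DAC
counts, in line with the +11 that was applied by hand.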
..............................................................................

Date:  12-DEC-1994   At: MSU/   Topics: "Test-load" EM_Fraction L1.5 CT
                         Fermi          DSP code, Calls from John Butler with
                                        questions about the End of Run files.

EM_Fraction (Tool #3) DSP code was test-loaded between stores #5270 and
#5271.  The configuration was:

    Term    EM Et 1x2   Isolation   EM_Fraction     Global      L1 Spec
    Number  Threshold   Threshold   Threshold       Cnt Thresh  Triggers
    ------  ---------   ---------   -----------     ----------  --------
        0    3.0 GeV        0.1         0.1             1       1 2
        1    3.0 GeV        0.2         0.1             1       3

i.e. a "goof-off" or testing configuration which has nothing to do with
global running.  The typical type of configuration file errors were
found and fixed.  Bill Cobau made the configuration files.

Dan Owen collected ~200 "noise" events using this configuration (with
Pass_one_of (100) ).

This code was removed before global running on store #5271 began.
Steve verified (via READLOG) that the global run for store #5271 used
the correct (old) L1.5 Cal Trig code, default parameters, and COOR
parameters, even though the Framework had been initialized while Steve
was swapping files on D0HTCC to return to the old L1.5 CT operation.
TRGMON also showed rates and programming compatible with the old operation.

END of RUN File Questions
In the evening, Captain J.Butler calls Philippe about problems with the end of
run summary for run #86834 (N.Amos couldn't be reached, and we were the last
ones to mess with the trigger). Philippe looks in TCC's logfile and sees
nothing too particular. The sequence of coor messages was:
WRT_HOST  BEG_RUN        LOGGER$BRD:TCC_BEGIN_0086834.INFO %% time: 16:49:24.90
WRT_HOST PAUS_RUN    LOGGER$BRD:TCC_PAUSE_0086834_081.INFO %% time: 17:30:03.46
WRT_HOST PAUS_RUN    LOGGER$BRD:TCC_PAUSE_0086834_082.INFO %% time: 17:30:06.84
WRT_HOST RESU_RUN    LOGGER$BRD:TCC_RESUM_0086834_036.INFO %% time: 17:32:26.59
WRT_HOST  END_RUN          LOGGER$BRD:TCC_END_0086834.INFO %% time: 17:43:25.85

Note that the pause and resume sub-numbering seems to follow a sequence
that is not reset with run numbers, and independent of each other, but looking
in LOGGER$BRD, this seems to be standard. There are also two PAUSEs for one
RESUME (COOR only knows why!), but this shouldn't be a problem. The run lasted
about 1 hour (54 min).
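
The 54 minute figure can be recomputed from the message timestamps with a
minimal Python sketch.  It assumes all five messages fall on the same day, and
the names MESSAGES and seconds_between are made up for this illustration.

```python
from datetime import datetime

# (message type, time) copied from the COOR message sequence above.
MESSAGES = [
    ("BEG_RUN",  "16:49:24.90"),
    ("PAUS_RUN", "17:30:03.46"),
    ("PAUS_RUN", "17:30:06.84"),
    ("RESU_RUN", "17:32:26.59"),
    ("END_RUN",  "17:43:25.85"),
]

def seconds_between(start, stop):
    """Seconds from start to stop, both given as HH:MM:SS.ff on the same day."""
    fmt = "%H:%M:%S.%f"
    delta = datetime.strptime(stop, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds()

times = dict()
for kind, stamp in MESSAGES:
    times.setdefault(kind, stamp)      # keep the first stamp of each type

run_seconds = seconds_between(times["BEG_RUN"], times["END_RUN"])
paused_seconds = seconds_between(times["PAUS_RUN"], times["RESU_RUN"])

print(f"begin to end: {run_seconds:.2f} s = {run_seconds / 60:.1f} min")
print(f"first pause to resume: {paused_seconds:.2f} s")
```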

Philippe found the Begin and End Run files in COPYCFG$ARCHIVE, and checked that
the beam crossing numbers didn't roll over. Computing the number of beam
crossings elapsed gives 911 E6 which matches the 286.3 * 60 * 54.
Computing the livetime (enable sptrg#30/beam X) gives 91.3 %, which is typical,
unlike the reading that Butler had of >98%, which propagated to other
numbers in the run summary.
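
The arithmetic behind this check, as a minimal Python sketch.  Reading "286.3"
as a beam crossing rate in kHz is an assumption (it is what makes the product
come out in the 900 E6 range).  The helper names elapsed and livetime and the
modulus argument are made up for this illustration, since the scaler width is
not given in this entry.

```python
# Assumption: "286.3" is the beam crossing rate in kHz.
CROSSING_RATE_HZ = 286.3e3
RUN_MINUTES = 54

expected_crossings = CROSSING_RATE_HZ * 60 * RUN_MINUTES
counted_crossings = 911e6      # from the Begin and End Run files

def elapsed(begin, end, modulus=None):
    """Counts elapsed between two scaler readings.

    modulus is the (hypothetical) wrap-around value of the scaler; it is only
    needed if the counter rolled over once between the two readings.
    """
    if end >= begin:
        return end - begin
    if modulus is None:
        raise ValueError("end < begin: counter rolled over, modulus needed")
    return end - begin + modulus

def livetime(live_crossings, total_crossings):
    """Fraction of beam crossings during which the trigger was live."""
    return live_crossings / total_crossings

ratio = counted_crossings / expected_crossings
print(f"expected {expected_crossings / 1e6:.1f} E6, "
      f"counted {counted_crossings / 1e6:.0f} E6, ratio {ratio:.3f}")
```

With these inputs the counted number comes out about 2% below the simple
rate-times-time estimate, which is the sense in which the two "match".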

Philippe told John B. that there was nothing really wrong and that they should
indeed go after Norm to recover this good run. The following run gave a more
normal run summary.
..............................................................................

Date:  2-DEC-1994   At: MSU    Topics: MTG PROM Files: Verify current versions,
                                       Delete old Timing_Specification_Files,
                                       Need to investigate the problem with the
                                       FE_Busy_4A TSF file,  Copy some
                                       important information from the paper
                                       log book so that we have it at MSU or D0

The following is a list of the MTG PROM File Versions that should now be
running in the system at D-Zero

 MTG
 Ch's     L1 Cal Trig MTG     Direct-In-Test-Trig MTG      ERPB MTG
-----     ---------------     -----------------------      --------
 1-8            1M                      1K                    1C
 9-16           2L                      1K                    2A
17-24           3K                      1K                    2A
25-32           4M                      1K                    2A


 MTG
 Ch's     FE-Busy MTG     L1 Framework MTG     Hold Transfer MTG
-----     -----------     ----------------     -----------------
 1-8           -                 1R                   1L
 9-16          -                 2L                   1L
17-24          -                 3M                   1L
25-32         4A                 4L                   4L


 MTG
 Ch's     L15 FW Control MTG     L15 FW Receive MTG     L15 Veto Conf MTG
-----     ------------------     ------------------     -----------------
 1-8              1A                     1B                    1B
 9-16             2A                     1B                    1B
17-24             3A                     1B                    1B
25-32             4A                     1B                    1B


 MTG
 Ch's     Start Digitize MTG
-----     ------------------
 1-8             1L
 9-16            1L
17-24            1L
25-32            4M


Today the following old versions of  Timing_Specification_Files  were
deleted both at MSU and Fermi:

   L1 Cal Trig files:  1L, 2K, 4L
   ERPB MTG files:     1A, 1B

Note that L1 Framework MTG file  3N  is being kept in case we need to run
the COMINT with only 6 clocks between beam crossings.

There appears to be a problem with the  FE_Busy_MTG_PROM_4_SN_4A  Timing_Spec
file.  At the very least the signal names in this TSF file are wrong.  And
it appears that the actual up-down tick times are also wrong.  Figure out
what is wrong, fix this file, check it against the running part.  See page
#20 of the  L1 & L15 Framework paper log book #2.   I think that this signal
should be up for two ticks starting with tick 108.


Information from L1 L15 Framework Log Book #2
---------------------------------------------
The following is the list of signals on the two M101 <--> M114 Cables

M101 <--> M114 Cable Number 1    log book page #18
--------------------------------------------------
Signal
Pair              Function
------  -----------------------
  1      Data Block Builder Busy \
  2      68k Prepair Data        |
  3      68k Display State       |  signals to the scalers in M101
  4      Wait Slave Ready        |
  5      VBD Run DMA List        |
  6      Wait Find VBD Buffer    /
  7      L15 Stretch signal, L15 Control to the M114 MTG's
  8      L15 Stretch signal, L15 Control to the M114 MTG's
  9      L15 Potential \
 10      L15 Skip      |  signals to the scalers in the bottom of M114
 11      L15 Cycle     /
 12      NC
 13      NC
 14      NC
 15      Special I/O MTG Ch #25 Output "L15 Operational" to And-Or Term #110.
 16      NC
 17      NC


M101 <--> M114 Cable Number 2       log book page #17
-----------------------------------------------------
Signal
Pair              Function
------  -----------------------
  1     Spec Trig #30 FSTD Output Live Beam X Clock to M114 Live Beam X Scalers.
  2     Spec Trig #30 FSTD Output Live Beam X Clock to M114 Live Beam X Scalers.
  3     NC
  4     NC
  5     NC
  6     NC
  7     NC
  8     AND-OR Input Term #71   L0_Slow_Inter    from L0 to the AND-OR Network
  9     AND-OR Input Term #72   L0_Slow_Z_Good   from L0 to the AND-OR Network
 10     AND-OR Input Term #73   L0_MI_Flag_0     from L0 to the AND-OR Network
 11     AND-OR Input Term #74   L0_MI_Flag_1     from L0 to the AND-OR Network
 12     AND-OR Input Term #75   L0_MI_Flag_2     from L0 to the AND-OR Network
 13     AND-OR Input Term #76   L0_MI_Flag_3     from L0 to the AND-OR Network
 14     AND-OR Input Term #77   L0_Slow_Z_Center from L0 to the AND-OR Network
 15     NC
 16     NC
 17    L0 Direct-In-Test-Trigger


Front connections to the Framework Main Timing MTG    log book page #21
-----------------------------------------------------------------------

   Term = B2 = B7 x B8 x B9 x B13 x B14 = B15   <-- Level 15 Stretch

   Term = E14 = E15   <-- Latched Global Specific Trigger Fired

   Term = B5  <-- COMINT Write A/B Control

   Clear Most Recent BAR <--   B4 x Term = E5   <-- Clear Most Recent

   Term = E4  <-- Front-End Busy BAR

   Term = B3  <-- COMINT Read A/B Control

   Latched Global Specific Trig Fired -->  E17 = E18 = E19 = E20 = E21 = Term

   Level 15 Stretch -->  B17 = B18 x B19 = B20 = B21 = Term = E24


         Input          Signal
         -----  -------------------------------------
           B3   COMINT Read A/B Control
           B4   Clear Most Recent BAR  output
           E4   Front-End Busy BAR
           E5   Clear Most Recent  input
           B5   COMINT Write A/B Control
          E15   Latched Global Specific Trigger Fired
          B15   Level 15 Stretch
          E17   Latched Global Specific Trigger Fired
          B17   Level 15 Stretch


Front connections to the L1 Calorimeter Trigger MTG    log book page #21
------------------------------------------------------------------------

         Input          Signal
         -----  -------------------------------------
           B4   Read A/B Control
           E5   Front-End Busy
           B6   Write A/B Control
          E29   Front-End Busy BAR
          B29   Clear Most Recent BAR
..............................................................................

Date:  29,30-NOV-1994  At: Fermi  Topics: Water Leak in M103-M104 radiator,
           1-DEC-1994             Look at TT +6,23 EM which has been excluded,
                                  DAC Pedestals for TT's +17,15 and +18,15,
                                  Added more sensors to the RPSS,  Work on
                                 getting Pulser runs for L15CT_PROV,

Solder over the pin hole in the "U" tube in the bottom radiator between M103
and M104.  The soldering went well because the radiator was blown out with
building air and the area to be soldered was cleaned with both sand paper and
a steel brush.  This radiator appeared to have a dimple in it right where the
pin hole was.  This dimple is thought to be caused during the brazing process
by some one who cooks the thin Cu "U" tube too much.  Smoke from the soldering
brought the VESDA up to a 4, i.e. just short of an alarm.

The Drip Detector Strip under this radiator had to be taken out and cleaned
to get the corrosion off of it.

The drip detector function of the RMI was re-enabled.  This required turning
both the Drip Detector "Sensor Input" and the Drip Detector "Local Alarm"
switches back on.

In rack M110 I installed an Air Flow sensor and a 95 deg F Temperature sensor.
These are connected to the appropriate inputs of the RPSS.  I connected a "RPSS"
cable and ran it over to the back of M101 for use with the PhotoHelic
differential air pressure gauge.  See the new file
TrgHard:[RPSS]PhotoHelic_to_RPSS.txt for more details about this.  I put the
appropriate labels on the RPSS.

Look at TT +6,23 EM which has been excluded since Sept shutdown
+6,23 EM is rack M105,  CBus = 1,  MBA = 170,  CA = 44:45,  "2nd" EM channel
on the CTFE card,  Clock Control Register is FA = 81,  bit of value 4 controls
this EM channel.
Well, it still looks noisy on the scope and running a L1 Cal Trig with a 3 GeV
EM Ref Set threshold everywhere you get about  0.6 Hz  if +6,23 EM is excluded
and about  5 to 6 Hz  if +6,23 EM is not excluded.
The decision is to leave it excluded.  Should we cut the resistor ?
We looked at the Examine from this run and all the noise is coming from
eta,phi,depth  11, 45, 2  in the Calorimeter.
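
For reference, "bit of value 4" is bit number 2 (binary 100) of the register.
A minimal Python sketch of the mask arithmetic and of the rate ratio quoted
above follows.  It assumes an 8-bit register and does not say whether setting
or clearing the bit is what excludes the channel, since this entry does not
state the polarity.  The helper names are made up for this illustration.

```python
EM_CHANNEL_MASK = 4      # "bit of value 4" = 0b100 = bit number 2

def set_bit(register, mask):
    """Return the register value with the masked bit set (8-bit assumed)."""
    return (register | mask) & 0xFF

def clear_bit(register, mask):
    """Return the register value with the masked bit cleared (8-bit assumed)."""
    return register & ~mask & 0xFF

def bit_is_set(register, mask):
    return (register & mask) != 0

# Trigger rates quoted above for a 3 GeV EM Ref Set threshold everywhere.
RATE_EXCLUDED_HZ = 0.6
RATE_INCLUDED_HZ = (5.0, 6.0)

ratios = [rate / RATE_EXCLUDED_HZ for rate in RATE_INCLUDED_HZ]
print(f"bit number: {EM_CHANNEL_MASK.bit_length() - 1}")
print(f"rate goes up by a factor of {ratios[0]:.1f} to {ratios[1]:.1f} "
      f"when the channel is not excluded")
```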

For the last couple of weeks in the Physics run Examines, Trigger Towers
Eta +17 and +18 Phi 15 both EM and HD have looked a little hot or noisy.
Looking at TrgMon ADC counts when there is no beam, this just looks like a
CTFE pedestal "drift".  I play human histogram and by hand make the following
changes to Init_DAC_Bytes.LSM:
           +17,15 EM  move DAC pedestal from  34  to  32.
           +17,15 HD  move DAC pedestal from  31  to  29.
           +18,15 EM  move DAC pedestal from  34  to  33.
Copy the new  Init_DAC_Bytes.LSM  to TCC and have TRICS load it in.  Also copy
Init_DAC_Bytes.LSM  to MSU.
Note that this is the second time in the last couple of months when we have
had a couple of channels on a single CTFE all "drift" at once.  Jan used the
compare on the new CALIB (which now looks at L1 Cal Trig data) to verify that
she could see these channels move.

Work with Jan and company to get the Pulser Runs for L15CT_PROV.  Get the LOW
amplitude run by going back to the single L1 trigger version of the CONFIG
files and using TRICS to tell L15 FW that the L1 trigger does not require
L15CT confirmation.  This is in  DATA3:[CAL]CALOR_086328_01.X_ZRD01
OK, finally get a L15CT pulser run at High amplitude.  This is in the file
DATA3:[CAL]CALOR_086419_01.X_ZRD01.  This was also done by hand using TRICS
to turn off the L15 FW so that the Cal Pulser would increment.   Still need
to get the Config files working.
..............................................................................

Date:  25,26-NOV-1994  At: Fermi  Topics: Air Flow Sensor un-Tied Down, Water
                                          Leak in M103-M104 radiator,  Joan
                                          has a couple of channels for us to
                                          look at,  RPSS print set.

About 6 AM on the 25th the call finally comes that they have found the water
leak.  It is showing up in the "Dan Owen drip detector" behind M104.  It
appears to be controlled by the Mud Flap so we decide to leave the system
running.  I start out for Fermi.   At Fermi I look at it while the system is
running.  All the drips from the end of the mud flap appear to be going onto
the drip detector and then into the channel between the racks and then down
between and out of the racks.  This is at the radiator between M103 and M104.

During the 4AM shot setup on the 26th, pull off the shockless system G10 and
the bottom mud flap between M103 and M104.  The leak is from the very bottom
turn around "U" tube on the bottom radiator on the M104 side of this radiator
on the bottom surface of the "U" tube.  It is about 1/8 of an inch into the
"U" tube from the butt weld.  It is on the surface of the "U" tube that was
stretched when the "U" tube was manufactured.  One can see stress marks in this
section of the "U" tube.

I packed paper towels around this part of the radiator to try to make sure
that the "spray" was converted into a "flow" and then reinstalled the mud
flaps and the shockless system G10.

During the 10 hour shutdown on the 29th an attempt can be made to solder over
this section of the "U" tube.  It is not corroded too badly at this time.  The
hole is far enough from the butt weld that the area can be cleaned.  The above
work took 2 hours (i.e. the full time of shot setup).  Estimate 4 hours for
the solder job.  Need hose splice and compressed air.

Remove the tie down from the air flow sensor at the input to M102.   Did not
boot TCC and it came up just fine.  Is Philippe's relatively new idea of
restarting zeller at TRICS Init time curing the problem of TCC not talking
to the CBus's after a long power off time?

For the last couple of weeks Joan says that Trigger Towers  Eta +17 and +18
Phi 15 both EM and HD have looked a little hot or noisy.  I wanted to look
at their pedestals during the shot setup that just passed but did not have
time.  We need to look at these two towers.

There does not appear to be an RPSS print set at D-Zero; need to bring one here.
..............................................................................

Date:  23-NOV-1994  At: MSU   Topics: Air Flow Sensor Tied Down,
                                      Water Drip Power Trip of L1

 ----->  M101 - M102 Air Flow Sensor is still TIED DOWN  RPSS Sensor  <-----

16:53 CST
All L1 racks power down because the RMI has detected a water drip.

about 18:05 EST:  Steve is called at MSU by Marcel and is told that L1 has
powered down.  He is told that RPSS has detected a water FLOW problem.
Steve gives Marcel Dan's home telephone number (and also Steve's home
telephone number).  Steve tries to call Dan at home but receives no answer.

18:15 EST
Edmunds is called at home by Jan and told that Detector Shifters and Joan are
investigating looking for the water leak.  Edmunds suggests places to look
for the water.   Edmunds is told that he will be called as soon as the
search for the leak is complete and that he will be kept informed.  Edmunds
and Jan discuss the possibility to turn off the drip detector part of the
RMI and leave the rest of RPSS running.

About 18:17 EST
Steve calls Joan in control room.  He is told by Joan that Jan is on the phone
with Dan, and that Dan has been informed of the current situation, and that
Dan has suggested turning off drip detector.  Steve tells Joan that Steve
will be watching at MSU for a while longer.  At the conclusion of this
short telephone conversation Steve is under the (mistaken) assumption that,
since Jan and Dan have been in contact, Dan is "in the loop."  Steve does
NOT call Dan.  Steve has no further contact with D0 Control Room.

About 17:25 CST
Someone (not Edmunds) takes the decisions:
   There is no water leak.
   It is OK to turn off the RMI drip detector.
   It is OK to Power Up L1.
This decision was taken in D0 Control Room with no telephone call to MSU.

17:30 CST
L1 is Initialized.  It has a couple of problems. In a mail message at 19:09 EST
Steve reports on the problems:

    (1) after the 1st INITIAL, a CTFE comparator (Total Et Ref Set 0
        at -18,21) read back 254 after being programmed to 255.

    (2) after the 1st download, none of the muon Specific Triggers
        had any And-Or rate.

    (3) after the 2nd download, first Geographic Section 0 was 100%
        Front-End Busy, then Geo Sect 0-13 (with the exception of 1
        and 5) thrashed around a lot.

I do not understand anything about problem 1.  Problems 2 and 3 (with the
exception of Geo Sec #0) are standard problems after the L1 FW has been
powered off.  It was not necessary to power cycle HTCC and its BA23.

In a mail message at 20:04 Jan reports that the RMI drip detector has been
disabled, L1 has been powered up, and "We didn't have any problems".

At about 20:30  Edmunds, having never heard anything from anyone, comes to
Physics Dept so that he can use a tube and the phone and not miss any
incoming calls.  He learns that all has been running since 15 minutes after
the first and only call to him.  Not clear who took the decision that all
was OK and power should be turned back on.

As of 21:15 EST there has still been only the original L1 Initialize at
18:30  so nothing has been done to investigate the problem #1 reported above
about the CTFE comparator   (e.g. a second Initial was not done to see if
the problem would repeat itself,  e.g. was this where the water was spraying).

Edmunds (having not been contacted except for the 18:15 call and having stayed
off of the phone so as not to block any incoming calls) is not aware of
any of the details of the water leak investigation or the decision to turn
back on (Lum was about 4.0).  Just two weeks ago we had a near miss with
disaster caused by a water leak in a BLS rack.  Because the RMI for that
rack repeatedly tripped it off people finally believed that there might be
a problem.  Without our L1 RMI we are flying blind.  To date we have had
zero false alarms from the L1 RMI drip detector so there is no reason not
to take such alarms seriously.   None of the people who are familiar with
the past history of L1 water leaks were contacted before someone took the
decision to turn L1 back on.
..............................................................................

Date:  21-NOV-1994     At: MSU    Topics: Look at L15CT statistics from three
                                          more nice stores.

After about 13 hours of continuous running starting from Lum of about 9.5 E30

S-15C/HDL% 68k parked Status ok (Load_Code Interrupt)   20-NOV-1994 00:02:48.48
S-15C/HDL% 68k never had to Un-Stick the DSPs  %% time: 20-NOV-1994 00:02:48.55
S-15C/HDL% 68k never saw any Byte Misalignment Problem in Object Lists

S-15C/HDL% Reading 68k Run Counters...         %% time: 20-NOV-1994 00:02:49.75
S-15C/HDL%...Orbit Master Loops           Count = -1197666861
S-15C/HDL%..."That's Me" With Transfer <N>Count = 1353435
S-15C/HDL%..."That's Me"   NO Transfer <n>Count = 858472
S-15C/HDL%..."Bystander" With Transfer <I>Count = 3874102
S-15C/HDL%..."Bystander"   NO Transfer <i>Count = 20499323
S-15C/HDL%..."Mark&Pass" With Transfer <F>Count = 22
S-15C/HDL%..."Mark&Pass"   NO Transfer <f>Count = 0
S-15C/HDL%..."Un-Stick"  With Transfer <E>Count = 0
S-15C/HDL%..."Un-Stick"    NO Transfer <e>Count = 0
S-15C/HDL% Put all DSPs in Reset, ready for code download
                                               %% time: 20-NOV-1994 00:02:50.54
Statistics:  Number of MFP events:  22
             Number of "Un-Stick" GDSP from Step D3:  0
             Total Number of events processed by L15CT:  2,211,907
             % of events processed by L15CT and NOT Transferred up to L2:  38.8%


After about 14 hours of continuous running starting from Lum of about 9.5 E30

S-15C/HDL% 68k parked Status ok (Load_Code Interrupt)   20-NOV-1994 17:03:04.70
S-15C/HDL% 68k never had to Un-Stick the DSPs  %% time: 20-NOV-1994 17:03:04.77
S-15C/HDL% 68k never saw any Byte Misalignment Problem in Object Lists

S-15C/HDL% Reading 68k Run Counters...         %% time: 20-NOV-1994 17:03:05.97
S-15C/HDL%...Orbit Master Loops           Count = -976982064
S-15C/HDL%..."That's Me" With Transfer <N>Count = 1450241
S-15C/HDL%..."That's Me"   NO Transfer <n>Count = 901330
S-15C/HDL%..."Bystander" With Transfer <I>Count = 4367174
S-15C/HDL%..."Bystander"   NO Transfer <i>Count = 21556732
S-15C/HDL%..."Mark&Pass" With Transfer <F>Count = 23
S-15C/HDL%..."Mark&Pass"   NO Transfer <f>Count = 0
S-15C/HDL%..."Un-Stick"  With Transfer <E>Count = 0
S-15C/HDL%..."Un-Stick"    NO Transfer <e>Count = 0
S-15C/HDL% Put all DSPs in Reset, ready for code download
                                               %% time: 20-NOV-1994 17:03:06.76
Statistics:  Number of MFP events:  23
             Number of "Un-Stick" GDSP from Step D3:  0
             Total Number of events processed by L15CT:  2,351,571
             % of events processed by L15CT and NOT Transferred up to L2:  38.3%


After about 15 hours of continuous running starting from Lum of about 9.5 E30

S-15C/HDL% 68k parked Status ok (Load_Code Interrupt)   21-NOV-1994 11:07:51.09
E-15C/HDL% 68k Last Un-Stick Action was for a problem at %X 000000D3...
E-15C/HDL%...Local  DSP A2=%XB3FF801F A3=%XB3FF801F A4=%XB3FF801F A1=%XB3FF801F
E-15C/HDL%...Local  DSP               B3=%XB3FF801F B4=%XB3FF801F B1=%XB3FF801F
E-15C/HDL%...Local  DSP C2=%XB3FF801F C3=%XB3FF801F C4=%XB3FF801F C1=%XB3FF801F
E-15C/HDL%...Global DSP B2=%XB327000F          %% time: 21-NOV-1994 11:07:51.56
E-15C/HDL%...a DSP not at D0     Un-Stick Count = 0
E-15C/HDL%...Global not at D3    Un-Stick Count = 1
E-15C/HDL%...a DSP not at D15    Un-Stick Count = 0
S-15C/HDL% 68k never saw any Byte Misalignment Problem in Object Lists

S-15C/HDL% Reading 68k Run Counters...         %% time: 21-NOV-1994 11:07:52.98
S-15C/HDL%...Orbit Master Loops           Count = -675499484
S-15C/HDL%..."That's Me" With Transfer <N>Count = 1446763
S-15C/HDL%..."That's Me"   NO Transfer <n>Count = 846762
S-15C/HDL%..."Bystander" With Transfer <I>Count = 5039252
S-15C/HDL%..."Bystander"   NO Transfer <i>Count = 23282802
S-15C/HDL%..."Mark&Pass" With Transfer <F>Count = 22
S-15C/HDL%..."Mark&Pass"   NO Transfer <f>Count = 0
S-15C/HDL%..."Un-Stick"  With Transfer <E>Count = 0
S-15C/HDL%..."Un-Stick"    NO Transfer <e>Count = 1
S-15C/HDL% Put all DSPs in Reset, ready for code download
                                               %% time: 21-NOV-1994 11:07:53.77
Statistics:  Number of MFP events:  22
             Number of "Un-Stick" GDSP from Step D3:  1
             Total Number of events processed by L15CT:  2,293,525
             % of events processed by L15CT and NOT Transferred up to L2:  36.9%

All 11 of the LDSP's show in the "# of Obj Found" part of the LDSP Status
Longwords that they overflowed their Object Lists i.e. found 9 or more objects.
The Status Longword from the GDSP says $27 in the "Terms Answers" byte.  I
understand the "7" part but not the "2" part.
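
The Orbit Master Loops counts in the three dumps above are all printed as
negative numbers.  A minimal sketch of how to read them, ASSUMING that the
counter is 32 bits wide, is printed as a signed integer, and has wrapped at
most once (none of this is confirmed here; the function name is illustrative):

```python
# Sketch: reinterpret a counter that was printed as a signed 32-bit integer.
# ASSUMPTION: the 68k counter is 32 bits wide and wrapped at most once.

def as_unsigned32(value: int) -> int:
    """Return the unsigned 32-bit reading of a signed 32-bit printout."""
    return value & 0xFFFFFFFF

# Orbit Master Loops counts copied from the three dumps above.
orbit_master_loops = [-1197666861, -976982064, -675499484]

for raw in orbit_master_loops:
    print(f"{raw:>12d} -> {as_unsigned32(raw):>10d}")
```

If the counter really is wider than 32 bits, or has wrapped more than once
in a long store, this reading is only a lower bound.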
..............................................................................

Date: 18,19-NOV-1994   At: D0 Hall  Topics: Air Flow Sensor Tied Down, Bring
                                    more de-H-ed MBD's to D0 Hall, Electronics
                                    Board meeting for Muon,  Inventory of
                                    CTMBD's,  Find out how PhotoHelic Limit
                                    Switches work, L15CT Pulser Run ConFig file,
                                    L15CT 68k_Ser Counters for a long run.

 ----->  M101 - M102 Air Flow Sensor is still TIED DOWN  RPSS Sensor  <-----

Bring more de-H-ed MBD's and CTMBD's to D0 Hall
Bring MBD's SN#7, and SN#17 back to D0 after de-H-ing them at MSU.
Bring CTMBD SN#14 back to D0 Hall after de-H-ing it at MSU.  CTMBD SN#14 had
been in use in M109 Tier 2 but when we started Data Block Builder reading
LTCC cards then CTMBD SN#14 had a problem with data bit of value 4 on only
the first read, i.e. the first read after this CTMBD recognised its MBA.
CTMBD SN#14 has had its 10H101's removed from the bus driver section.  See
the log book entries from 3 and 10 FEB-1994 for more details about this CTMBD.
There are now 4 CTMBD's and 2 MBD's in the spares cabinet at D0 Hall.

Other circuit board and 10H101 considerations:
How many of the CTMBD's that are currently in use in Tier 1, Tier 2, Tier 3,
and other locations have not had their 10H101's pulled?  Are there any other
cards in use that still have 10H101's,  e.g. the TLM's in the top of M102?
Are there other cards still at NWA that we should recover to de-H them?

         Number of
   Rack   CTMBD's               Functions
   ----  ---------  ----------------------------------------------------------
   M101      1      L1 FW Timing Signals to And-Or Input Terms, bottom card
   M102      2      Spec Trig Fired - Start Digitize Backplane, FSTD Backplane
   M103      4      L1CT Final Readout,  L15 Framework,  two Tier 1's
   M104      2      two Tier 1's
   M105      3      two Tier 1's and one Tier 2
   M106      2      two Tier 1's
   M107      3      two Tier 1's and one Tier 3
   M108      2      two Tier 1's
   M109      3      two Tier 1's and one Tier 2
   M110      2      two Tier 1's
   M111      3      two Tier 1's and one Tier 2
   M112      2      two Tier 1's
   M114   +  1      lower M114 backplane DBSC's Foreign scalers
         -------
            30  number of CTMBD's in use at D0,
                plus 4 spare at D0, plus 1 in MSU Test Rack.  ---> 35 total ??

The only non-H CTMBD's appear to be: both cards in M102, the Tier 2 in M109,
the 4 spares at D0 Hall, and the card in the MSU Test rack.  This leaves 27
CTMBD's that still have 10H101's.  The CTMBD in the MSU Test Rack is a mess
and can only be used there.  Is it OK to pull just the 10H101's from the bus
driver section or does one also need to replace the parts in the LED and Lemo
driver sections?
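
The CTMBD arithmetic above can be cross-checked with a short sketch.  The
per-rack counts and the list of non-H cards are copied from this entry; the
variable names are only illustrative.

```python
# Sketch: re-derive the CTMBD totals from the per-rack table above.
ctmbd_in_use = {
    "M101": 1, "M102": 2, "M103": 4, "M104": 2, "M105": 3, "M106": 2,
    "M107": 3, "M108": 2, "M109": 3, "M110": 2, "M111": 3, "M112": 2,
    "M114": 1,
}
spares_at_d0 = 4
msu_test_rack = 1

in_use = sum(ctmbd_in_use.values())
total = in_use + spares_at_d0 + msu_test_rack

# Non-H cards as listed in the text: both M102 cards, the M109 Tier 2,
# the 4 spares at D0 Hall, and the MSU Test Rack card.
non_h = 2 + 1 + spares_at_d0 + msu_test_rack
still_10h101 = total - non_h

print(in_use, total, non_h, still_10h101)
```

This agrees with the 30 in use, 35 total, and 27 still carrying 10H101's
quoted above.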

Replacing the parts in the CTMBD's may make sense because in principle these
cards will need to continue into Run II.  Replacing the parts in the TLM's in
the top of M102 may also make sense because a mistake in TAS number requires
a Data Cable resync and wastes a lot of time.  Perhaps we can do something
during the February shutdown.

PhotoHelic Differential Air Pressure Sensor

        B   A       Layout of pins from the rear view of the
      C   H   F     AMP Hex connector on the PhotoHelic gauge
      D       E

     Connections:       Upper Limit Switch:  E,F
                        Lower Limit Switch:  C,D
                        Bulb:                A,B
                        no connection        H

When cold the bulb has about 4.8 Ohms resistance.  It may operate at very
low power, i.e. mostly as an IR emitter, for long life.

The detectors are photo resistors.  When pulled out of their narrow slot
holders they look like 2k in room light and >100k in the dark.  When in their
narrow slot holders they look like 120k in room light and 20k to a flashlight
held at about 1 foot distance.  The detectors are part No.  CL905L 421.
The Lower Limit detector is dark until the pressure is > the lower limit.
The Upper Limit detector becomes dark when the pressure is > the upper limit.
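
The light/dark behaviour of the two limit detectors can be written down as
a small sketch.  This only restates the two sentences above; the function
names and the example numbers are illustrative, and how the dark/lit state
maps onto the limit switch contacts (pins C,D and E,F) is NOT covered here.

```python
# Sketch of the limit-detector logic described above.
# ASSUMPTION: function and parameter names are illustrative only.

def detector_states(pressure: float, lower: float, upper: float) -> dict:
    """Return which photo resistors are dark for a given pressure."""
    return {
        # Dark until the pressure rises above the lower limit.
        "lower_dark": not (pressure > lower),
        # Becomes dark once the pressure rises above the upper limit.
        "upper_dark": pressure > upper,
    }

def in_window(pressure: float, lower: float, upper: float) -> bool:
    """True when both detectors are lit, i.e. lower < pressure <= upper."""
    states = detector_states(pressure, lower, upper)
    return not states["lower_dark"] and not states["upper_dark"]

# Arbitrary example values, in whatever units the gauge is marked in.
print(in_window(0.5, lower=0.2, upper=1.0))
```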

L15CT 68k_Ser counters
Look at L15CT 68k Counters after 16 hours of continuous running starting
from a luminosity of 9.0E30
                                               %% time: 19-NOV-1994 08:05:16.49
E-15C/HDL% 68k Last Un-Stick Action was for a problem at %X 000000D3...
E-15C/HDL%...Local  DSP A2=%XDAFF801F A3=%XDAFF801F A4=%XDAFF801F A1=%XDAFF801F
E-15C/HDL%...Local  DSP               B3=%XDAFF801F B4=%XDAFF801F B1=%XDAFF801F
E-15C/HDL%...Local  DSP C2=%XDAFF801F C3=%XDAFF801F C4=%XDAFF801F C1=%XDAFF801F
E-15C/HDL%...Global DSP B2=%XDA27000F
E-15C/HDL%...a DSP not at D0     Un-Stick Count = 0
E-15C/HDL%...Global not at D3    Un-Stick Count = 1
E-15C/HDL%...a DSP not at D15    Un-Stick Count = 0
S-15C/HDL% 68k never saw any Byte Misalignment Problem in Object Lists

S-15C/HDL% Reading 68k Run Counters...        %% time: 19-NOV-1994 08:05:18.28
S-15C/HDL%...Orbit Master Loops           Count = -579710553
S-15C/HDL%..."That's Me" With Transfer <N>Count = 1577255
S-15C/HDL%..."That's Me"   NO Transfer <n>Count = 949188
S-15C/HDL%..."Bystander" With Transfer <I>Count = 4953894
S-15C/HDL%..."Bystander"   NO Transfer <i>Count = 23877085
S-15C/HDL%..."Mark&Pass" With Transfer <F>Count = 25
S-15C/HDL%..."Mark&Pass"   NO Transfer <f>Count = 0
S-15C/HDL%..."Un-Stick"  With Transfer <E>Count = 0
S-15C/HDL%..."Un-Stick"    NO Transfer <e>Count = 1
S-15C/HDL% Put all DSPs in Reset, ready for code download   19-NOV 08:05:19.07

So in this 16 hours of continuous running we had  1  unstick GDSP from not
reaching Step D3.  We had only 25 MFP events (i.e. we can hardly use this
for data transport error checking).  There would have been almost 1000 in spill
pulser events but none of these should overlap with a "physics" trigger that
is using L15CT so that should not have caused the GDSP Step D3 Timeout.
Once again all DSP status longwords actually look OK so this must have taken
just slightly over the timeout period.

All 11 of the LDSP's show in the "# of Obj Found" part of the LDSP Status
Longwords that they overflowed their Object Lists i.e. found 9 or more objects.
The Status Longword from the GDSP says $27 in the "Terms Answers" byte.  I
understand the "7" part but not the "2" part.

Number of events processed by L15CT (i.e. "N" + "n")   2526443
Fraction of the time when L15CT processed the
event AND the event was NOT transferred up to L2        37.6%
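
A minimal sketch of how these two numbers follow from the 68k run counters
(the <N> and <n> counts are copied from the dump above; the function name
is illustrative):

```python
# Sketch: recompute the L15CT summary numbers from the 68k run counters.

def l15ct_summary(with_transfer: int, no_transfer: int) -> tuple:
    """Return (events processed, percent processed but NOT transferred)."""
    processed = with_transfer + no_transfer
    return processed, 100.0 * no_transfer / processed

# "That's Me" With Transfer <N> and "That's Me" NO Transfer <n>.
processed, not_transferred_pct = l15ct_summary(1577255, 949188)

print(f"Events processed by L15CT:           {processed}")
print(f"Processed but NOT transferred to L2: {not_transferred_pct:.1f}%")
```

The same function can be fed the <N> and <n> counts from any other dump of
the 68k Run Counters.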

L15CT Pulser Run
The problem with the L15CT pulser run having only one pattern is that the
pulser needs the TAS protocol to complete in one Beam Crossing cycle in order
to think that all went OK and thus it should increment.  Jan is setting up
a "parallel" pure L1 trigger to cause this to happen.  It now looks like:

CFG_CAL:CALOR_PLS_TRIG_LOW.CAL;1

    @CFG_CAL:Calor_pls_trig_low.trig
         @CFG_CAL:calor_pls_trig.lev1       <-- sys "With cal L15 requirements"
         @CFG_LV0:L2_pass_fail.filt
         @CFG_CAL:calelec_detector.req
              @CFG_LV0_CRATE:trig_level1.req
              @CFG_CAL_CRATE:cetec_random.req
              @CFG_CAL_CRATE:cal_inspill_reset.req
              @CFG_CAL_CRATE:cal_norm_ccn.req
              @CFG_CAL_CRATE:cal_norm_ecnw.req
              @CFG_CAL_CRATE:cal_norm_ecne.req
              @CFG_CAL_CRATE:cal_norm_ccs.req
              @CFG_CAL_CRATE:cal_norm_ecsw.req
              @CFG_CAL_CRATE:cal_norm_ecse.req
         @CFG_CAL_CRATE:trig_l15ct.req
         @CFG_CAL_CRATE:calor_mpls_low.req

    @CFG_CAL:Calor_pls_trig_l15.trig
         @CFG_CAL:calor_pls_trig_l15.lev1
              @cfg_cal:cal_pls_trig.l15              <---- file does not exist
         @CFG_CAL:calelec_detector.req
              @CFG_LV0_CRATE:trig_level1.req
              @CFG_CAL_CRATE:cetec_random.req
              @CFG_CAL_CRATE:cal_inspill_reset.req
              @CFG_CAL_CRATE:cal_norm_ccn.req
              @CFG_CAL_CRATE:cal_norm_ecnw.req
              @CFG_CAL_CRATE:cal_norm_ecne.req
              @CFG_CAL_CRATE:cal_norm_ccs.req
              @CFG_CAL_CRATE:cal_norm_ecsw.req
              @CFG_CAL_CRATE:cal_norm_ecse.req
         @CFG_CAL_CRATE:trig_l15ct.req

The much easier thing to do is to use just one Spec Trig and just not tell
the L15 Framework that this Spec Trig requires L1.5 confirmation.
..............................................................................

Date: 9,10,11-NOV-1994  At: D0 Hall  Topics: Air Flow Sensor Tied Down, Bring
                                     Spare VME Modules to D0 Hall,  Bring more
                                     de-H-ed MBD's to D0 Hall, Try to Look at
                                     And-Or IMLRO T5 mismatch,  L15CT_Pulser
                                     Runs for L15CT_PROV,  Test having TCC read
                                     L15CT 68k_Ser scalers and DSP status while
                                     L15CT is processing events, Trouble
                                     starting data taking for the Friday
                                     morning store

 ----->  M101 - M102 Air Flow Sensor is still TIED DOWN  RPSS Sensor  <-----

Bring the following VME Modules to D0 Hall:

    Short 214  MSU SN#1
    FANCY 214  MSU SN#13
    MVME-135   MSU SN#2

These were added to the existing stock of spare modules already at DZero:

    "V" Type 214  MSU SN#5
    IRONIC I/O    MSU SN#7
    VMX DRIVER    MSU SN#4
    TERM SELECT P2  MSU SN#2

All 7 of these modules were moved to the bottom of the Spare Cards Storage
Rack (where the spare Hydra-II is also stored).  The TrgBook:VME_Inventory.LBK
file was brought up to date.

Bring more de-H-ed MBD's to D0 Hall
Bring MBD's SN#5, SN#11, SN#15 back to D0 after de-H-ing them at MSU.

     Location     MBA    Pull MBD     Install MBD
    ----------   -----  ----------   -------------
    M101 FSTD     132      SN#17        SN#5
    M101 Busy     135      SN#7         SN#15
    M114 Upper    105      SN#4         SN#11

MBD SN#11 which was just installed in M114 upper backplane, MBA=105, is
wired as an AND-OR MBD.  MBD SN#4 which was just pulled out of M114 upper
backplane had no Timing Signal wire wrap wiring on it.

Take MBD's SN#4, SN#7, and SN#17 to MSU for De-H-ing.  MBD SN#4 is Rev A.

We have been making a mistake since day one with the way that we set up the
MBD's and the CTMBD's.  We have not been wiring anything to the 10H115
receivers for the timing signals that are not used on the Specific Backplane
CBus.  Thus for a 10H115 that services some used channels and some "nothing
connected" channels, the input bias network is screwed up for all of its
channels.

This is probably not too big of a problem with the CTMBD's where the 10H115's
are directly driven by 10H116's but it is still wrong even there.  It is
definitely a problem on the MBD's where the 10H115's are trying to receive
timing signals over long cables.

What we should do on the MBD's is start wiring the unused channels of the
Specific Backplane CBus to channel 16 of the Timing Bus i.e. the LED ON
Timing Signal which never moves.  For the CTMBD's we could use one of the
unused Cal Trig Timing Signals for this purpose (e.g. Cal Trig MTG
Channels 23, 24, or 32).

The ECL data book says not to leave disconnected any of the inputs to a
10H115 and I know that this really does cause problems from my experience
with the ECL scope boxes which now have bias resistors on their inputs
so that they work the same with just one channel in use as with all 4 channels
in use.

I edited the [D0_Text.Timing_and_Control]MBD_and_CTMBD_Timing_Signal_Wiring.Txt
to include a warning about connecting something to these unused Specific
CBus Timing Channels.

Try to Look at And-Or IMLRO T5 mismatch
Make a version of VTC_Test called VTC_Test_2 that when it finds a T5 error
it prints out first the 2 hex digits from the Spec Trig 0:15 IMLRO and then
the 2 hex digits from the Spec Trig 16:31 IMLRO.  This makes a total of 8
characters that get sent to the VTC terminal upon T5 errors.  Will the L2
Sequencer wait this long?  I have loaded VTC so that there are expected to be
Pilot COMINT Timeouts.

L15CT Pulser Runs for L15CT_PROV
The config files for making the L15CT Pulser runs are set up as follows:

    CFG_Cal:Calor_PLS_Trig_Low.Cal
         CFG_Cal:Calor_PLS_Trig_Low.Trig
              CFG_Cal:Calor_PLS_Trig_Low.Lev1
                   CFG_Cal:Calor_PLS_Trig_Low.L15
                        CFG:Cal_PLS.RS
              CFG_Cal:CalElec_Detector.Req
              CFG_Cal_Crate:Trig_L15CT.Req
              CFG_Cal_Crate:Calor_MPls_Low.Req

The files for High and Low amplitude have the obvious differences in file name.
The indentation here indicates who calls whom.  This whole setup is a little
strange (e.g. look at the distribution of what is in what directory, how do
you know if it is safe to edit a subordinate file because what other master
file may be calling it?).  For now we are developing a setup that will use
the same config file for the L15CT Pulser run as for the Dan Owen Pulser run.
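
One way to answer the "what other master file may be calling it?" question
is to turn indented listings like the one above into a reverse index.  The
following is only a sketch: the listing is copied from this entry, the
function name is made up, and nothing here reads the real config files.
File names are lower-cased for comparison because VMS file names are not
case sensitive.

```python
# Sketch: turn an indented "who calls whom" listing into a reverse index.
from collections import defaultdict

LISTING = """\
CFG_Cal:Calor_PLS_Trig_Low.Cal
     CFG_Cal:Calor_PLS_Trig_Low.Trig
          CFG_Cal:Calor_PLS_Trig_Low.Lev1
               CFG_Cal:Calor_PLS_Trig_Low.L15
                    CFG:Cal_PLS.RS
          CFG_Cal:CalElec_Detector.Req
          CFG_Cal_Crate:Trig_L15CT.Req
          CFG_Cal_Crate:Calor_MPls_Low.Req
"""

def callers_of(listing: str) -> dict:
    """Map each file (lower-cased) to the set of files that call it."""
    callers = defaultdict(set)
    stack = []  # (indent, name) for the current chain of open parents
    for line in listing.splitlines():
        if not line.strip():
            continue
        indent = len(line) - len(line.lstrip())
        name = line.strip().lower()
        # Drop entries that are not ancestors of this line.
        while stack and stack[-1][0] >= indent:
            stack.pop()
        if stack:
            callers[name].add(stack[-1][1])
        stack.append((indent, name))
    return dict(callers)

for name, parents in sorted(callers_of(LISTING).items()):
    print(f"{name}  <-  {', '.join(sorted(parents))}")
```

Feeding it the listings for several master files (one after the other in
the same string) shows any subordinate file that has more than one caller.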

When this was first tried there was nothing but Token Loop Count Overflows
from the L15CT crate.  This was because the L15CT Ref Set was set at 2.0 GeV
and every TT in the world was a candidate for the object list.  This caused
a lot of LDSP processing which caused GDSP to be late making it to step D3
which caused 68k_Ser to timeout DSP processing and to produce no L15CT Data
Block which caused the Token Loop Count Overflow on the L15CT Crate.

For now this has been "Fixed" by setting the L15CT Ref Set to 1000 GeV for all
Trigger Towers.  It is not clear if we should also move the 68k_Ser Timeout
up by a little bit.  Steve finds that we have timed out 7 times during Global
Physics running since 30-Oct-1994.  This timeout had been kept short because at
one time it was thought to be useful to try to "salvage" events that died
during L15CT processing.  For the past N months whenever 68k_Ser times out
the DSP processing of an event, the event is flushed when the L2 sequencer
cleans up Data Cable 0.

Expected values of data from the Low and High amplitude L15CT pulser runs:

       EM     |Eta|=1     |Eta|=20                HD     |Eta|=1     |Eta|=20
   A      +  ---------   ----------           A      +  ---------   ----------
   M   Low|    140        1 or 2              M   Low|     85        1 or 2
   P  High|   Saturate      19                P  High|   Saturate      13

These are, to order of magnitude, the MAXIMUM values that one will expect to
see.  These represent what happens when a Pulser Pattern, by chance, hits
somewhere in all 4 Cal Towers that make up a Trigger Tower.  You also will see
values of approximately 3/4, 2/4, and 1/4 of what is shown above.  The point of
all of this is that it looks like (except at high eta) we will get pretty good
bit coverage of the L15CT with the pulsers set up as they are for Dan Owen
Pulser Runs.

Test TCC read 68k_Ser scalers and DSP status while L15CT is processing events
While we were in the middle of a global physics run I started having TCC read
information from L15CT, from either 68k_Ser or from DSP status words.  From
a .com file TCC was asked to loop through  L15CTSYS DSP_STAT 68K_CTRL
68K_STAT 68K_ERR 68K_CNT 68K_FLAG waiting 10 seconds between each step.  This
did not appear to cause L15CT any problems,  i.e. L15CT was not bothered by
having a usec here or there taken by TCC-caused VME cycles, and the VME
mastership transfer is working OK even when L15CT is processing events.

Trouble starting data taking for the Friday morning store
Something must have been going on early, e.g. there were initializes at:
    Initialize Starts     Initialize Done
    -----------------     ---------------
        8:53:38              8:54:29
        8:55:30              8:56:21
        8:56:39              8:57:30
        8:59:17              9:00:08
        9:04:29              9:05:20
        9:12:00              9:12:51
        9:12:58              9:13:49
        9:16:20              9:17:11
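
A quick sketch to pull the duration of each initialize, and the idle time
between them, out of the table above.  The times are copied from the table;
the names are illustrative.

```python
# Sketch: durations of, and gaps between, the initializes listed above.
from datetime import datetime

INITS = [
    ("8:53:38", "8:54:29"),
    ("8:55:30", "8:56:21"),
    ("8:56:39", "8:57:30"),
    ("8:59:17", "9:00:08"),
    ("9:04:29", "9:05:20"),
    ("9:12:00", "9:12:51"),
    ("9:12:58", "9:13:49"),
    ("9:16:20", "9:17:11"),
]

def seconds_between(start: str, end: str) -> int:
    """Seconds from start to end, both H:MM:SS on the same day."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())

# Time taken by each initialize.
durations = [seconds_between(start, done) for start, done in INITS]
# Time from one "Initialize Done" to the next "Initialize Starts".
gaps = [seconds_between(INITS[i][1], INITS[i + 1][0])
        for i in range(len(INITS) - 1)]

print("durations (s):", durations)
print("gaps (s):     ", gaps)
```

Every initialize in the table takes the same time, so the initializes
themselves look normal; only the spacing between requests varies.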

Then at 9:17:53 COOR starts a full trigger download to TCC.  At this time the
log file indicates that TCC was seeing fresh data.  As part of the download,
COOR pauses L1FW at 9:23:05 so that it can load L15CT, which it starts to do.
Then we see:
C-RCV/CH1%   1:35  %000025FF L15CTSYS    START CRATE(0)    %% time: 09:23:24.46
S-15C/HDL% Preparing Params for L1.5 CT Crate  %% time: 11-NOV-1994 09:23:24.53
S-15C/HDL% Copying Params to L1.5 CT Crate     %% time: 11-NOV-1994 09:23:24.60
S-EXC/MBX% Flush_to_File now Servicing Exception Mailbox   %% time: 09:23:31.11
X-DSP/EXC%2203468%PAS-F-FILALRACT, file already active     %% time: 09:23:30.85
X-DSP/EXC%Skipping                             %% time: 11-NOV-1994 09:23:30.85
S-EXC/MBX% Exception Mailbox now empty         %% time: 11-NOV-1994 09:23:31.48
 TRICS V6.3   CLOSED LOGFILE, DUA0:[TRIGGER]TRICS_30OCT94.LOG %% ti 09:23:31.48
C-RCV/CH2%   1:26  %00000001     PHAT CLOSELOG %% time: 11-NOV-1994 09:42:47.84

It should have completed the  "Copying Params to L1.5 CT Crate" in about 6 or
7 seconds, i.e. at about 9:23:30.    Is it OK that the log entries
"PAS-F-FILALRACT, file already active" and " Skipping" are out of time order?
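
The out-of-order entries can be picked out mechanically.  A sketch, ASSUMING
only that each log line ends in an HH:MM:SS.cc time stamp as in the excerpt
above (the function names are illustrative, and this is not part of TRICS):

```python
# Sketch: flag log lines whose time stamp is earlier than one already seen.
import re

LOG = """\
C-RCV/CH1%   1:35  %000025FF L15CTSYS    START CRATE(0)    %% time: 09:23:24.46
S-15C/HDL% Preparing Params for L1.5 CT Crate  %% time: 11-NOV-1994 09:23:24.53
S-15C/HDL% Copying Params to L1.5 CT Crate     %% time: 11-NOV-1994 09:23:24.60
S-EXC/MBX% Flush_to_File now Servicing Exception Mailbox   %% time: 09:23:31.11
X-DSP/EXC%2203468%PAS-F-FILALRACT, file already active     %% time: 09:23:30.85
X-DSP/EXC%Skipping                             %% time: 11-NOV-1994 09:23:30.85
S-EXC/MBX% Exception Mailbox now empty         %% time: 11-NOV-1994 09:23:31.48
 TRICS V6.3   CLOSED LOGFILE, DUA0:[TRIGGER]TRICS_30OCT94.LOG %% ti 09:23:31.48
C-RCV/CH2%   1:26  %00000001     PHAT CLOSELOG %% time: 11-NOV-1994 09:42:47.84
"""

TIME_RE = re.compile(r"(\d{2}):(\d{2}):(\d{2})\.(\d{2})\s*$")

def trailing_time(line: str):
    """Return the trailing time stamp in hundredths of a second, or None."""
    match = TIME_RE.search(line)
    if match is None:
        return None
    hours, minutes, seconds, hundredths = map(int, match.groups())
    return ((hours * 60 + minutes) * 60 + seconds) * 100 + hundredths

def out_of_order(lines):
    """Return the lines stamped earlier than the latest stamp seen so far."""
    flagged = []
    latest = None
    for line in lines:
        stamp = trailing_time(line)
        if stamp is None:
            continue
        if latest is not None and stamp < latest:
            flagged.append(line)
        else:
            latest = stamp
    return flagged

for line in out_of_order(LOG.splitlines()):
    print(line)
```

On the excerpt above this flags the "file already active" and "Skipping"
lines, both of which are stamped earlier than the Flush_to_File line that
precedes them.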
..............................................................................

Date: 2,3,4-NOV-1994  At: D0 Hall  Topics: Air Flow Sensor Tied Down, L15CT
                                   Trigger Tower data problem,  Replace MBD's
                                   with non-10H MBD's,  Collect L1 Trig overlap
                                   data, Run Find-DAC, Spares that I need to
                                   bring to D0 Hall, Monitor L15CT operation

 ----->  M101 - M102 Air Flow Sensor is still TIED DOWN  RPSS Sensor  <-----

L15CT Trigger Tower Data
Work on the L15CT Trigger Tower data problem that Dan Owen discovered last
weekend.  The problem was that Local DSP A3 had its Rack #2 Tot Et data bits
of value 2 and 4 stuck low.  A longword of Rack #2 Tot Et data from A3's
Type #1 DeBug Section when there was no energy in the calorimeter should
read $1A1C1C1C but it was reading $18181818.  The path for this data is
Lower CRC Ch#2 Copy #1 (i.e. rear row), to the Ten Port Paddle 4 connector
side next to top connector, to Comm Port #3 on DSP A3.  Note that this same
data goes to Local DSP A4 Comm Port #2 where it was reading out OK.
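
The readback is what one gets by forcing the data bits of value 2 and 4 low
in every byte of the expected longword.  A sketch, ASSUMING that each byte
of the longword passes through the same two stuck bit lines (the names are
illustrative):

```python
# Sketch: effect of data bits of value 2 and 4 being stuck low.
# ASSUMPTION: every byte of the longword sees the same stuck bit lines.

STUCK_LOW = 0x02 | 0x04  # data bits of value 2 and 4

def with_stuck_low_bits(longword: int, stuck: int = STUCK_LOW) -> int:
    """Clear the stuck bits in each of the four bytes of a longword."""
    mask = 0
    for byte in range(4):
        mask |= stuck << (8 * byte)
    return longword & ~mask & 0xFFFFFFFF

expected = 0x1A1C1C1C          # no energy in the calorimeter
observed = with_stuck_low_bits(expected)

print(f"${expected:08X} -> ${observed:08X}")
```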

The problem was in the CRC to DSP Paddle Board cable.  At the DSP end of the
cable the receptacles for data bits of value 2 and 4 have no spring force
in their contacts.  I expect that someone bent these contacts with a paper
clip (attempting to test the cable) or else it was a defective connector to
start with.

I made another cable, tested it, and installed it.  The labels from the old
cable were put on the new cable.  The new cable is installed OK but it is not
threaded into the Panduit cable tray in a very fancy way.  I was afraid to move
too many cables too much because it would be easy to pull out a connector.
I cut the ends off of the old cable for autopsy but I did not un-thread the
rest of the old cable from the Panduit cable tray.

After replacing this cable I ran the L15CT Test Trigger to get some events
moving and captured one event via ZBDump in the file
        VWork1:  L15CT_ZBD_Dump_2Nov94.Txt
This file has both the L1 data and the L15CT data including MFP data.  I wrote
a small table at the top of this event to help navigate through it.
If there are any old junky ZBD files around we should get rid of them.

Collect L1 Trigger Overlap Data
Between about  21:16 and 23:20  collect L1 Trigger overlap data in the file
VWork1:  SpTrg_Fired_List_2215_2Nov94.Txt    The average Ch#13 D-Zero
Luminosity was 7.2E30.  The  V10.0 08E30  prescale file was in use during
this time.   The "Edmunds overlap analysis" of this SpTrg_Fired_List file is in
VWork1:  SpTrg_Fired_Analysis_2215_2Nov94.Txt.

DeHed MBD's for L1 Framework
Bring MBD's SN#13 and SN#16 both non 10H101 MBD's to Fermi.  MBD SN#16 was
removed from lower M101 AND-OR last week and taken to MSU for de-H-ing.  MBD
SN#13 was pulled out of the MSU Test Rack and de-H-ed.

While checking MBD SN#13 at Fermi I noticed something funny about the
resistance of the Spec Front CBus Inverted MS Data Bit line.  It is only a
couple of hundred ohms to GND.  Trace this to the 10H188 driver for the front
CBus U20.  Pull this chip and install a socket.  Now MBD SN#13 and SN#16 have
resistances that look the same.

I also notice that on Rev. B MBD's the 10H188 drivers for the mid and
high data bits of the Spec Front CBus (i.e. U15 and U20) have their pin #16
only pick up Gnd via a trace over the top to pin #1.  But it is easy to fold
pin #16 down against the solder side Gnd plane to pick up Gnd.  I make this
"modification" to MBD's SN#13 and SN#16.

While studying the print set for the MBD's I notice that the 10H101's are in
the CONTROL circuit that tells the MBD which way to drive data and if it
should be driving data anywhere or just holding the Spec Front CBus at low.
Thus oscillating 10H101's could really make global problems for MBD operation.

Prepare CTMBD SN#09 along with a CCCP card to install in the FSTD Cell
backplane in rack M102.  Eventually I plan to also do this in M101 but I want
to try this one backplane at a time.  I also want to fix a broken CTMBD at
MSU to bring here as a spare before committing to use 2 of them in FSTD
backplanes.

M101 Upper And-Or backplane; pull MBD SN#15 and install MBD SN#16.
M102 Upper And-Or backplane; pull MBD SN#11 and install MBD SN#13.
M102 FSTD Cell backplane; pull MBD SN#05 and install CTMBD SN#09 and a CCCP.

MBD's SN#5, SN#11, and SN#15 will return to MSU for de-H-ing.

Remember that when replacing a MBD with a CTMBD and a CCCP that the CBus and
Timing Bus cables from M114 need to plug into a different location on the
backplane for the MBD than for the CTMBD.

Now with all And-Or backplanes running with de-H-ed MBD's try using the special
version of VTC code that reports mismatches between the two halves of the
And-Or Network.  This shows very few errors.  Less than one per screen full of
1's and 0's.  They are all "T5" mismatches.  There are few enough of these
errors that I leave this version of code in over night.  In the morning,
running with a lower prescale file, it is possible that the "T5" mismatches are
slightly more frequent (perhaps almost one per screen full).  I doubt that this
is any longer a MBD problem.  It could be: bad IML, bad IMLRO, bad And-Or
Backplane.  It is also very possible that muon Level 1 signals  (T5 is And-Or
Terms 32:39) are still moving at the time of the rising edge of the IML Clock
signal.  Best estimate of what to do next is to make a special version of VTC
that prints the value read from both IMLRO's.  It is possible that at high
luminosity there are fewer "T5" mismatch errors because the activity in the
muon L1 Trig is different at high Luminosity.

Find_DAC_Bytes
Made a full sweep of Find-DAC.  There was only one TT that failed (which was
an excluded TT).  Find-DAC had not been run since 17-AUG-1994 although some
hand patches had been made.  Did a Read-Load of the new file so with the next
TRICS Init it will start using the new values.  There was very little change:

            3 tower(s) incremented by -2
          148 tower(s) incremented by -1
         2276 tower(s) incremented by  0
          128 tower(s) incremented by  1
            5 tower(s) incremented by  2
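
A small sketch that totals the Find-DAC change histogram above (the counts
are copied from the list; the variable names are illustrative):

```python
# Sketch: summarize the Find-DAC change histogram listed above.
changes = {-2: 3, -1: 148, 0: 2276, 1: 128, 2: 5}  # increment -> towers

total = sum(changes.values())
unchanged_pct = 100.0 * changes[0] / total
net_shift = sum(step * count for step, count in changes.items())

print(f"towers in sweep: {total}")
print(f"unchanged:       {unchanged_pct:.1f}%")
print(f"net shift:       {net_shift} DAC counts summed over all towers")
```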

Spares that I need to bring to D0 Hall
I need to bring more VME stuff to support L15CT e.g. 214's, 135, ?
Wire wrap wire (other than black),  Small PROM Labels,

Monitor L15CT Operation
Steve collected the file  VWork1: TrgMon_Dump.Txt_L15CT_V10_Initial_Run  which
has lots of nice information about the initial view of L15CT as it is actually
dumping events.  Today I used long integration TrgMon to collect more L15CT
functioning data.
With luminosity of 5.2 E30 and running the V100-6E30 prescale file one sees:

   Spec Trig      Trig Name     L15 Reject %     L15 Skip %
   ---------      ---------     ------------     ----------
        7         EM_1_High          62              23
        8         EM_2_Med            6              42

These Reject percentages do not seem to change very much with luminosity.
When this information is compared with the measured L1 Trigger Overlap data
it is not completely clear why there is such a big difference in the L15
Skip percentages.  Both of these triggers have a lot of overlap with Spec
Trig #11 (about 34%), which is also processed by L15CT, but it requires zero
objects and thus always passes.
..............................................................................

Date: 1-NOV-1994  At: MSU   Topics: Start Trigger List V10.0  --->  Start
                                    L15 Cal Trigger Throwing Away Events.

   As of the first run in the early AM on 1-NOV-1994 Trigger List V10.0 was
   in use.  This Trigger List uses L15 Cal Trig to actually filter events.

..............................................................................

Date: 26,27,28-OCT-1994   At: Fermi  Topics: Air Flow Sensor Tied Down, Bring
                                     spare cards to Fermi, Work on the bad
                                     data signals from M101,  Bring more IC's
                                     to Fermi for the Chip Kit,  Meeting with
                                     Atlas people,  DAQ Conference,  Swap MBD's
                                     in the M101 lower And-Or crate, review
                                     T5 error distributions

   ----->  M101 - M102 Air Flow Sensor is TIED DOWN  RPSS Sensor  <-----

Bring MBD SN#18 (non 10H101) and CTMBD SN#10 to D-Zero Hall as tested spare
cards.

On the lower And-Or crate in M101 put the "T cable" onto the front CBus and
take a look at all the signal lines.  Everything looks just fine: Timing
Signals, Card Address, Function Address, and Data lines.  I can not see
Direction and Strobe do anything but that is OK.

With the power turned off use the Fluke to look at the CBus cable between M114
and M101.  Look from the M114 end (using the 34 pin connector that was put on
last week).  This only lets me look at Data lines, Direction, Strobe, and the
high order 7 Function Address lines.  Can see back to the terminator in M101
and verify that there are no open lines and that the terminator is OK and that
there are no line to line shorts.  Everything looked OK.  Could not check
MBA lines, CA lines, or the low order FA line because there was not a connector
installed on these signals in M114 yet.

In the CBus cable between the M101 MBD's and the M114 BBB crush on another
34 pin connector so that one can see the Mother Board Address lines, the Card
Address lines, and the Function Address line of value 1.  These signals all
look just fine on the scope.  Also this allowed me to verify (because I can
read the MBA) that it is the lower M101 And-Or Crate that has the bad looking
data lines.

So the bad looking data line problem from the lower And-Or crate in M101 must
be one of the MBD cards in M101.  It could be either that the lower And-Or MBD
has a problem driving the lines or else that one of the other 3 MBD's in M101
is leaking onto the bus back to M114 when it is not being addressed.

Bring more IC's to Fermi for the Chip Kit here.  This is a bit funny because
it is "illegal" to repair and test a card in the running system at D-Zero Hall.
The Chip Kit at Fermi now has some of each of the following:

   74  LS 00            10   101           Sockets  Solder Tail: 16, 20, 24
   74 ALS 04            10 H 102           110 Ohm 8 Resistor DIP Packs
   74 HCT 08            10   103           CTFE PAL'S: 1,2,3,4
   74 ALS 30            10   104           PAL's:  16V8  16RA8  16R8
   74 ALS 32            10 H 104           MTG PROM's  245A's
   74 ALS 74            10   109           COMINT PROM's  265's
   74 ALS 86            10 H 115
   74  LS 123           10 H 116
   74 ALS 138           10 H 124
   74   F 139           10 H 125
   74 ALS 240           10 H 131
   74 ALS 245           10   133
   74 ALS 520           10 H 162
   74 ALS 540           10 H 166
   74 ALS 541           10 H 173
   74 ALS 574           10 H 188
                        10 H 189

Swap MBD's in the M101 lower And-Or crate.  Pull MBD SN#16 and install
MBD SN#18, a non 10H101 MBD.  The data line value of 4 now looks OK for
the data coming from this crate.  Start the error checking version of
VTC code running.  During a Global Physics run there now appears to be
about one mismatch in 2000 events (i.e. one per screen full on the VTC
console).  Compare this to last week when almost every event had a mismatch
in the And-Or Term 128:255 range.  Looking at things now, the mismatch
histogram is:
                 CODE  Counts   And-Or Terms #'s
                  T5     30        32:39
                  TC      5        88:95
                  TF      5       112:119
                  TI      2       136:143

While reloading VTC (to start running the And-Or Term readout match error
checking version) Philippe noticed that TCC started complaining about Pilot
COMINT timeouts.  This fits with what we guessed last week, i.e. at least
some of the Pilot Timeouts are caused by VTC reloads.  This also was seen
when we returned the "standard" VTC code.  These were 28-OCT-1994 at
7:30 AM and 12:15 PM respectively.

While D0 was running muon-only triggers at between 14 and 200 Hz, Steve
looked again at the distribution of T% errors on the VTC console.  Now
there were many more errors per VTC screen, typically between 2 and 7
errors per screen.  The average number of errors per screen was of order
4.  ALL of these errors were T5 errors.  At 200 Hz it is difficult to
accurately count errors on the screen, but most readings were taken at
low rate running or while the data cable was re-synching (i.e. VTC console
relatively stable).  We never saw multiple T% errors on a single event,
and the time between T5 errors was approximately uniformly distributed
between 1 row (or fraction of a row, i.e. 80 or fewer events) to about
10 rows.
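
The screen-based counts above can be turned into approximate event rates.
This is only a rough sketch: it assumes a full VTC screen holds about 2000
events (from "one mismatch in 2000 events, i.e. one per screen full") and a
row holds about 80 events, and the variable names are made up for the example.

```python
# Rough conversion of "errors per VTC screen" into "events per error".
# Assumed from the log text: ~2000 events per screen, ~80 events per row.
events_per_screen = 2000
events_per_row = 80
rows_per_screen = events_per_screen // events_per_row

avg_errors_per_screen = 4            # "order of 4" during muon-only running
events_per_error = events_per_screen / avg_errors_per_screen

# Observed spacing between T5 errors: about 1 row up to about 10 rows.
min_gap_events = 1 * events_per_row
max_gap_events = 10 * events_per_row

print(rows_per_screen, events_per_error, min_gap_events, max_gap_events)
```

With these assumptions the average is roughly one T5 error per 500 events,
which sits inside the observed 80 to 800 event spacing.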

We looked at the distribution of MBD's in the Framework (and M114) racks.
Here is what we saw:

                # of FW MBD's       # of FW MBD's
    Rack #        (de-H'ed)         (not de-H'ed)       # of CT MBD's
    ------      -------------       -------------       -------------
    M101                1                   3                   0
    M102                1                   2                   1
    M103*               0                   0                   2
    M114                0                   1                   1
                      ----                ----                ----
                        2                   6                   3

We need to start a project to de-H the 6 FW MBD's.

    * M103 FW Expansion Backplane, consisting of L1.5 Framework and
      L1 Cal Trig Final Readout halves
..............................................................................

Date: 19,20,21-OCT-1994   At: Fermi  Topics: Work on the readout problem from
                              M101 and M102 lower And-Or backplanes, Error in
                              the BBB description, Edit the Set_IML_FF.DAT file,
                              SBSC readout cycles,  Connectors to monitor CBus
                              data install in M114, Hand edit Init_DAC_BYTES.LSM
                              Notice Pause Timeouts of Pilot COMINT.

Turn off most of L1 Cal Trig.  Leave only the first 4 racks of Tier 1 running
so that VTC does not find a pilot data error on each event.  This means that
all of the And-Or Terms 128:255 are undefined but typically drift high.

Running the VME code that compares all 255 And-Or terms and running with
the IML's set for real input data one could see errors on almost every event.
These included some "T9" errors not seen last week.  All other errors were
in And-Or Input Terms 128:255.

Switch the IML's over to all $FF test data and the number of VTC discovered
errors went way down (perhaps one in 50 events).

Running Test Trigger at about 28 Hz watch the Front-CBus signals on the lower
And-Or backplane.   Everything looks OK.  All Data Lines are OK and the CA
and FA lines look fine.  This is with the IML's setup for $FF test data.

Pull the MBD SN#18 from M102 lower And-Or backplane and install MBD SN #19
which is a non 10H101 MBD.  Swap the IMLRO's in M102 between the lower and
upper And-Or backplanes.  IMLRO SN#10 started out in the lower and IMLRO
SN# 11 started out in the upper.  With the IML's using real data we are
back seeing lots of errors (almost every event) still including some T9
errors.

Pull the BBB's in M114 for CBus 2 that service the L1 Cal Trig Tier 2 and
Tier 3.  This makes no difference.

This makes a space where I can plug the ECL to scope box into the backplane
BBB to COMINT bus.  Data line of value 8 looks fine.   Data line of value 4
has sections that look funny.  In these funny sections the lows look OK but
the highs have hash and are not a full high.  I did not carefully check other
data lines.  What I'm looking at is the data block builder running at 10 to 20
Hz.  Now what section of the COMINT Data Block Builder PROM's do these funny
sections correspond to?

There are about 1618 reads in the CBus #2 COMINT PROM's.  This takes about
950 usec  i.e. about 580 nsec per read.   An IMLRO has 16 registers or about
9 usec to read.  So on the scope try to navigate to 1%.  There are some other
markers, e.g. the reads for the Large Tile pattern and L15 Cal Trig final
readout IMLRO's are all visible because their BBB's are pulled out.
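
The timing estimate above can be reproduced with a short calculation.  This
is only a sketch using the approximate figures quoted in this entry; the
variable names are illustrative.

```python
# Approximate figures quoted in the log entry above.
n_reads = 1618          # reads in the CBus #2 COMINT PROM's
total_usec = 950.0      # time to run through all of those reads
regs_per_imlro = 16     # registers read from one IMLRO

usec_per_read = total_usec / n_reads
imlro_usec = regs_per_imlro * usec_per_read
fraction_of_sweep = imlro_usec / total_usec

print(f"{usec_per_read * 1000:.0f} nsec per read")
print(f"{imlro_usec:.1f} usec per IMLRO")
print(f"{fraction_of_sweep:.1%} of the full read sequence")
```

One IMLRO therefore occupies roughly 1% of the scope trace, which is why the
navigation has to be done to about that precision.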

Thus trying to navigate on the scope's horizontal scale and look for landmarks
the areas that I think look funny on data line of value 4 correspond to the
two M101 And-Or IMLRO's.   People were running but on the next test I should
execute   Set_IML_FF.DAT  to have fixed high data to look at.

So before putting things back together and turning L1 Cal Trig on I swapped
the BBB for M101.  I pulled SN# 14 and installed SN#9.   BBB SN#9 is a card
that has been used before.  See logbook notes from 22-DEC-1993 and 10-FEB-1994.
I did not check the BBB to COMINT backplane signals after swapping the M101
BBB.

It was learned that the BBB card description does not match the schematic of
the BBB.  The BBB does pass MBA, CA, FA, DIR, STRB, and write data independent
of whether or not it is being addressed.   We should edit the BBB description.

It was relearned that during the Monitor Pool Refresh, when TCC reads the
SBSC's by hand, it needs to do Write Cycles with the Write Strobe active.  Are
these broadcast only on CBus #2 or are they also on CBus #0, #1, #3 ??

Edit  Set_IML_FF.DAT  so that the IML test data pattern for the And-Or terms
that carry Bunch Number show a legal Bunch Number  i.e. only one bit out of
the six is set.  I set the Bunch Number And-Or bit that is readout on data line
of value 4.  This change was necessary so that VTC would not report a non-
valid Bunch Number on every event and thus make it harder to see the And-Or
IMLRO read problems.
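
The "legal Bunch Number" rule (exactly one of the six bits set) can be
written as a small check.  This is a sketch only: it treats the six Bunch
Number And-Or terms as a hypothetical 6-bit field and says nothing about
where those bits actually sit in the IML test data registers.

```python
BUNCH_BITS = 6  # six And-Or terms carry the Bunch Number


def is_legal_bunch_pattern(pattern: int) -> bool:
    """Return True when exactly one of the six Bunch Number bits is set."""
    if pattern < 0 or pattern >= (1 << BUNCH_BITS):
        return False
    return bin(pattern).count("1") == 1


all_ones_ok = is_legal_bunch_pattern(0b111111)   # plain $FF-style data
single_bit_ok = is_legal_bunch_pattern(0b000100)  # one bit set
print(all_ones_ok, single_bit_ok)
```

All-ones test data fails the check, which is why the unedited file made VTC
report a non-valid Bunch Number on every event.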

In rack M114 on the CBus cables from M101 and M102 install a 34 pin connector
(covering FA2:FA8,Strobe,Direction,D1:D8) so that one can monitor the CBus
signals to/from these racks.

Look at the Data from the AND-OR backplane  IMLRO's

                        Rack M101                         Rack M102
Data Line        ------------------------          -------------------------
  Value            0:127         128:255             0:127         128:255
---------        ----------    ----------          ----------    ----------
    1                OK           Bad                  OK            OK
    2                OK          Not too Bad           OK            OK
    4          Bot OK Top Bad    VERY Bad              OK            OK
    8                OK          Not too Bad           OK            OK
   16                OK            OK                  OK            OK
    32            Top Fuzzy       Fuzzy Top&Bot         OK            OK
   64                OK            OK                  OK            OK
  128                OK            OK                  OK            OK

   Order of problems:     OK  >  Fuzzy  >  Not_too_Bad  >  Bad  >  Very_Bad

All data bits look OK for the readout of the FSTD Cell (FSTD cards and DBSC
cards) for both M101 and M102.

All data bits look OK for the readout of the M101:   Front-End-Busy IMLRO,
(Str Dig + FEBz Disable + L2 Disable) IMLRO,  and the top card file DBSC's.

All data bits look OK for the readout of the M102:   Spec Trig Fired IMLRO.

What can be wrong in M101 ?:
1. Bad front CBus cables,
2. Bad And-Or Card with a 10H188 data driver that stays turned on,
3. Trouble with the And-Or card file MBD's (i.e. they can not properly drive
   data back to the BBB but they do properly clear off the bus when not
   addressed so that the two top backplanes in M101 readout OK),
4. Trouble with a MBD in the upper two backplanes of M101 e.g. one of these
   MBD's leaks current all the time onto the bus back to the BBB.

Review the stuff from 21,22,23 JULY 1994 when the upper AND-OR backplane was
worked on in M101 for a similar or same problem

Last week Jan had found a couple of TT's that had drifted by one count.  She
saw these using the newest features of Cal CALIB.  Today hand edit
Init_DAC_Bytes.LSM  to bring these channels back to 8.  Copy the new version of
this file to D0HTCC:: and MSU::[Trg_Current.DZero].  The changes are:

           Channel   Was_Reading   Old_DAC   New_DAC
         -12,16 EM       7          34         36
         -12,16 HD       7          40         42
         -14,7  EM       9          35         33
         -14,7  HD       9          44         42

We notice that there have been some  Timeouts of Pilot COMINT.  Some of these
look very likely associated with known boots of the VTC 68k.  Did DAQ EXP
cause the other cases by randomly booting VTC ???   This appears to have
happened 14 times since 1-JULY-1994.   Philippe has dug out the log files
and made a summary in  VWork1:Pause_Timout.Log
..............................................................................

Date: 13,14-OCT-1994   At: D0 Hall     Topics: Continue IMLRO readout testing

We continue testing IMLRO readout of ALL And-Or Input Terms.  Philippe
installed a new TCC system which allows "histogramming" of a single Function
Address, without de-asserting MBA, CA, etc.  Using IML Test-Data registers
to force ALL And-Or Input Terms true, we saw "bit 4" readback errors at low
rate (1 error per 10^6 reads).  Neither shutting off the MPOOL Server nor
halting the Framework MTG affected the error rate.  When the histogram
sample is big enough (e.g. 10^6) one can get more than 1 error per
histogram (seen 3 / 10^6 twice).  This proves wrong the potential theory that
it would be the first of the reads (when the address is just selected)
that can fail.

Dan installed (temporarily) a version of VTC code which compared ALL And-Or
Input Terms between M101 and M102.  Running at 1 Hz, we watched 300 events.
21 of these 300 events had errors (we did not check or display the "bit value"
of the error).  ALL of these errors were in the 16 IMLRO registers
corresponding to the upper And-Or Input Terms (i.e. CBUS = 2, MBA = 65/129,
CA = 50).  The distribution was:

    And-Or Input        Function        Number of
    Term Range          Address         Errors
    ------------        --------        ---------
    128..135            128 ($80)       0
    136..143            129 ($81)       1
    144..151            130 ($82)       1
    152..159            131 ($83)       5
    160..167            132 ($84)       1
    168..175            133 ($85)       2
    176..183            134 ($86)       0
    184..191            135 ($87)       0
    192..199            136 ($88)       3
    200..207            137 ($89)       3
    208..215            138 ($8A)       3
    216..223            139 ($8B)       0
    224..231            140 ($8C)       0
    232..239            141 ($8D)       0
    240..247            142 ($8E)       1
    248..255            143 ($8F)       1

This distribution is approximately uniform (a uniform distribution would
have had 21/16 = 1.3 errors per Function Address).  Function Address 131
had 5 errors.  Errors were seen in FA's with each of the bits of value 1,
2, 4, and 8 set in the low nibble.
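
The uniform expectation quoted above can be checked directly from the table.
A minimal sketch, with the per-Function-Address counts copied from the table
and illustrative variable names:

```python
# Errors seen per Function Address in the 300-event sample (from the table).
errors_by_fa = {
    128: 0, 129: 1, 130: 1, 131: 5, 132: 1, 133: 2, 134: 0, 135: 0,
    136: 3, 137: 3, 138: 3, 139: 0, 140: 0, 141: 0, 142: 1, 143: 1,
}

total_errors = sum(errors_by_fa.values())
expected_per_fa = total_errors / len(errors_by_fa)
worst_fa = max(errors_by_fa, key=errors_by_fa.get)

print(total_errors, round(expected_per_fa, 1), worst_fa)
```

The counts add up to the 21 errors mentioned in the text, and FA 131 is the
largest single bin.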

It is significant that all errors were seen in the upper And-Or Input Terms.
It would be easy to blame the IMLRO for either Specific Triggers 0..15 or
16..31, except for the fact that earlier testing (using Cal Trig Random Test)
showed that BOTH of these IMLRO's were generating "bit value 4" errors.

It is hard to argue that the problem is with BBB's or M114 backplane or
COMINT, because these things are common between lower and upper And-Or Term
IMLRO's.  All 4 "And-Or Cells" use FW MTG's.  Are there two independent
problems, one in each of the affected cells, which both happen to affect bit
value 4?  For example, 2 screwed-up (Electrocircuits) And-Or cards which are
loading the CBUS?

We should run more tests:

    - make a new front-panel CBUS cable for one (or both) of the affected
      cells, which bypass the And-Or cards (i.e. only touch MBD, IML, and
      IMLRO).

    - look at DC voltages (and also resistances) on the front-panel CBUS
      cables

Note: How to use the new "read-histogram" feature:
    - The message isn't yet part of $ TRICS_ACCESS, use instead
        $ @EENV:COMMANDS
        $ PHAT READHIST cbus mba ca fa sample size
    - You need to turn INFO messages ON to display the full histogram
        (on the remote console or the logfile)
        use $ TRICS_ACCESS or $ PHAT TRC_INFO 1 1
    - The string returned in the reply message shows the peak of the
      histogram with its bin content. There is a bug in the code that limits
      the bin content to 3 characters (there is room for 4 characters; this
      will be fixed in the next system).
    - Turn INFO back OFF when you are done
        use $ TRICS_ACCESS or $ PHAT TRC_INFO 0 0

    There also is another new message for displaying a register's
characteristics as stored in the database.
    - use $ PHAT SHOW_REG cbus mba ca fa
    - The string returned in the reply message shows what was last read by TRICS
        (e.g. read-back after the last write).
    - Again, most of the data is displayed in INFO messages.
    - example of usefulness: show what TRICS thinks the R/W masks are for a MTG
        PAL, and how it is programmed.
    - Pipeline registers are stored by their lowest fa.
..............................................................................

Date: 12-OCT-1994   At: D0 Hall  Topics: ECB meeting at NWA, IMLRO readout
                                         testing, look at CALIB EXAMINEs
                                         to see pedestal values

We continue testing IMLRO readout of Large Tile And-Or Input Terms.  We
set all Large Tile And-Or Input Terms to true (by defining 0 MeV Large
Tile Ref Sets) and capture Data Blocks via TRGMON (using the spy window in
Hex mode looking at item #395-398 and 426-429).  We saw the standard
"bit of value 4" problems at low rate.  Dan also installed (temporarily)
a version of VTC which sensed this problem and displayed error messages.
We saw T1, T2, T3, and T4 error messages (i.e. all possible error messages).

Note that the And-Or Input Terms were set "DC high" in this test, which
tends NOT to imply timing problems with IML inputs.

We also triggered on one of the "bit 4" And-Or Input Terms, to get an idea
whether this problem is readout-only or also affects triggering.  The And-Or
rate appeared steady, which points to a readout-only problem.

We re-installed the "normal" VTC code.

Joan showed us CALIB examine plots which showed a few Trigger Towers with
funny pedestals:

    -12, 16 both EM and HD had an average pedestal of 6.9
    -14,  7 HD only        had an average pedestal of 9.0

We looked at these with TRGMON with no beam in the machine and saw the same
thing.  We should re-evaluate Pedestals when we have an opportunity.
..............................................................................

Date: 11-OCT-1994   At: MSU  Topics: Check operation of +17,15 EM and HD.

Dan Owen checks the Calorimeter Examine output looking for anything funny with
+17,15 and finds nothing.
..............................................................................

Date: 10-OCT-1994   At: MSU  Topics: Pedestal DAC values for +17,15 EM and HD.

The zero energy response from TT's +17,15 EM and +17,15 HD had shifted down
from 8 to 7.  It is not clear why both halves of +17,15 shifted.  Is the BLS
disconnected or its big hybrid dead??   Anyway to get the zero energy response
back to 8 the  Init_DAC_Bytes.LSM  was hand edited.   The DAC pedestal value
for +17,15 EM moved from 31 to 33.  The DAC pedestal value for +17,15 HD moved
from 28 to 30.  This new Init_DAC_Bytes.LSM  file is in the TrgCur: and
D0HTCC:: at D0 and at MSU::[Trg_Current.DZero].
..............................................................................

Date:  6,7-OCT-1994   At: Fermi  Topics: Work with Philippe running Cal Trig
                                         Tests:  -17,9 EM Ref Set #3 stuck on
                                         problem and EM Ref Set #0 skipping +-2
                                         problem,  reading And-Or Terms from
                                         Large Tile Triggers, ECB meeting with
                                         Dave Buchholz

Work running Cal Trig tests. The  problem with the comparator being stuck high
for EM Ref Set #3 at Trigger Tower  -17,9 ended up being that it was shorted to
the Tot Et comparator for Tot Et Ref Set #0.  Pins #53 and #85 were shorted
together on the backplane by the metal clip in the 3M connector on the cable
that leads to the ERPB.  I pulled off the cable and cut the metal clip out of
the connector.  I did not try to clip off the backplane pins any more.  When
the TTL high and the TTL low fought the CHTCR saw a TTL high.  This makes sense
because there are 100 ohm resistors between the opposite TTL outputs and the
CHTCR was effectively looking at the middle of these two 100 ohm resistors.

Load and run newest version of the Cal Trig Test code that checks the large
tiles (EWORK1:TRICS_V63.SYS_6OCT94;1).  Today the +/- 2 global count problem
associated with the count in EM Ref Set #0 appears at high rate.  We skip over
all EM Ref Sets (test Tot RefSets only) so that we can push on the Large Tile
tests.  At first it shows about one readout error of Large Tile And-Or Terms in
30k loops.  Then with the Monitor Pool refresh shut off it starts running 100k
loops between errors (but Philippe forgot to read the Andor Cards to see if the
symptoms were still the same, we only know that redoing the same loop didn't
repeat the error). A new version of the code was made but not loaded
(EWORK1:TRICS_V63.SYS_6OCT94;2) that reads the IMLRO card registers twice in a
row, and also reads the other IMLRO and reports an error when it finds a
discrepancy.
111513: Large Tile Count is 2 but REF_7 LT Andor Terms .GE. 1,2,3 are 1,0,0
206896: Large Tile Count is 4 but REF_2 LT Andor Terms .GE. 1,2,3 are 0,1,1

The new version of the system (i.e. its test code) seemed to interfere with the
framework (the "watch double buffer" process kept resynchronizing the pipes)
and the older version (ETRICS:TRICS_V62.SYS_7SEP94) of the system was reloaded.

Now back to the EM Ref Set #0 skipping around by +-2 problem.  Philippe and
Steve track this down to eta +13:+16  phi 1:8.  Pushing on the CHTCR and its
cables does not help.  Pushing on the proper EM Ref Set #0 Tier 2 CAT2 card
did help.  The operand #1 side of this card (not the CBus side) was pushed back
into the card file.  It is possible that I actually detected a small movement
into the card file when this side of this card was pushed on.  Now is the CBus
still properly connected?

Check run (82 kloops) of random test with no errors before giving system back.

Evening and more loops of old TRICS running in eta 1:16.  No errors
E-HTT/PAR%rand% Loop 675000/1000000, Error Count is 0

Later in the evening and now more tests using the newest version of TRICS
(EWORK1:TRICS_V63.SYS_6OCT94;2).  At first we saw errors every 5 or 10 k loops.
Then it ran for about 60k loops with no errors.  Then more errors again.  This
was a special version of TRICS Cal Trig test that read the AND-OR IMLRO card
twice and also read the second (Spec Trig 16:31) AND-OR IMLRO card two times.
During all of this set of tests the Monitor Pool Refresh was turned off.

All possible combinations of errors were seen  e.g. the 1st read could be OK
and the 2nd read be wrong, or the 1st read could be wrong and the second read
OK,  both reads from the Spec Trig 0:15 IMLRO could be OK and one of the reads
from the Spec Trig 16:31 IMLRO could be wrong.

M101 1st read bad, 2nd read bad, M102 read bad         0
M101 1st read bad, 2nd read bad, M102 read  ok         0
M101 1st read bad, 2nd read  ok, M102 read bad         0
M101 1st read bad, 2nd read  ok, M102 read  ok        22
M101 1st read  ok, 2nd read bad, M102 read bad         3
M101 1st read  ok, 2nd read bad, M102 read  ok        29
M101 1st read  ok, 2nd read  ok, M102 read bad        17
M101 1st read  ok, 2nd read  ok, M102 read  ok    225900
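
The table above can be summed per read to see how the failures split up.
This is just a tally sketch with the counts copied from the table; the tuple
layout (M101 1st read bad, M101 2nd read bad, M102 read bad) is a convention
chosen for the example.

```python
# Key: (M101 1st read bad, M101 2nd read bad, M102 read bad) -> count
counts = {
    (True,  True,  True):  0,
    (True,  True,  False): 0,
    (True,  False, True):  0,
    (True,  False, False): 22,
    (False, True,  True):  3,
    (False, True,  False): 29,
    (False, False, True):  17,
    (False, False, False): 225900,
}

total = sum(counts.values())
with_error = total - counts[(False, False, False)]
m101_first_bad = sum(n for key, n in counts.items() if key[0])
m101_second_bad = sum(n for key, n in counts.items() if key[1])
m102_bad = sum(n for key, n in counts.items() if key[2])

print(total, with_error, m101_first_bad, m101_second_bad, m102_bad)
```

No entry has both M101 reads bad at once, which fits a sporadic read problem
rather than a stuck input.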

After a while, we noticed that all of the read errors (from either the  Spec
Trig 0:15 IMLRO or the Spec Trig 16:31 IMLRO)  had the bit of value 4 wrong
(always dropped, never added). The vast majority of the errors came from FA=13
although some of the other FA's were seen at a considerably lower rate (i.e.
FA=11,12).   This maps onto the andor terms for LT_Ref_Set#7 >= 2 (more than 50
times),  LT_Ref_Set#4 >= 3 (seen 5 times), and  LT_Ref_Set#2 >= 1 (seen 4
times). Is this a data path problem in M114  e.g. epoxy ???

We can also think about the sporadic background problem appearing during boot
or initialize of the style "Previously 11 instead of 15 @ cbus 2 mba 129 ca 9 fa
3" (as appearing in this logfile boot sequence). Different ca and fa have
been seen, and it was believed to only be a read-back problem. But this is also
pointing to the bit of value 4, in the same backplane.
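
The "11 instead of 15" message can be decoded into dropped and added bits
with two mask operations.  A minimal sketch, assuming 15 is the expected
value and 11 is what was read back; the function name is made up for the
example.

```python
def bit_errors(expected: int, read: int) -> tuple[int, int]:
    """Return (dropped, added) bit masks for one register read-back."""
    dropped = expected & ~read   # bits that should be set but read as 0
    added = read & ~expected     # bits that should be clear but read as 1
    return dropped, added


dropped, added = bit_errors(expected=15, read=11)
print(dropped, added)
```

Here the only difference is the bit of value 4, dropped and not added, which
is the same signature as the IMLRO read errors above.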

Philippe tried locking the Read and Write A/B pipe control lines to the L1 FW
both high and low.  This did not appear to help or hurt.  At the end of these
tests we returned to the old version of TRICS.
..............................................................................

Date:  5-OCT-1994   At: Fermi  Topics: Note to Steve Pier, Visit ANL

Sent note to Steve Pier at UCI asking about how to speed up the LATCH and
XMIT_TRIG operation of ERPB - DC.

T2 Supervisor meeting at ANL with L. Price, J. Dawson, R. Blair and
T. Fuess of ANL, A. Lankford, S. Pier and Birgit at UCI.
..............................................................................

Date:  4-OCT-1994   At: Fermi  Topics: Repair +16,9 HD,  Work with Al Ito,
                               work with Joey Thompson,  Jan says there is
                               a noisy Trig Tower,  Change the Init_DAC_Bytes
                               DAC value for +20,23 EM,  Repair PDM-14.

The most recent Dan Owen Pulser run showed that +16,9 HD was at 400%.  The
problem was a cold solder joint in the Term-Attn.  Replaced the Term-Attn.

Al Ito is worried about "Latch Efficiency".   He is using And-Or Input Terms
#84:#87 which are called SCT0:SCT3.  I had him put the same signal into all
of them and I "programmed" 4 Specific Triggers to look at them with just the
AND-OR rate scalers. All looked OK.  Then he turned on his HV and things looked
funny.  He will look more at his stuff.  He still has logic analyzers and junk
stacked up in front of the L1 FW.

Joey Thompson wants a NIM input to the And-Or Network to test some scintillator
that looks at timing of A layer muons.  There appear to be some open unused
inputs in the NIM to ECL converter in the Bagby-Norm_Amos rack.  He is going to
use either the 3rd or 4th input from the top which is Term #122 or #123.  Note
that these still have old veto scheme type names in the  Trig_Config.CTL  file.

Jan Guida reports that Trigger Tower  +6,23 EM  has been Excluded (or else
walked around with the Ref Sets, I'm not sure which).  It is most likely
noisy HV but so far there has not been a quiet enough run to see it in the
precision readout.  I need to find out more about this.  Joan is at COMO this
week.  Note that +6,23 EM has been excluded before (e.g. around 1-APR-1994)

Trig Tower +20,23 EM (which does not exist) has been causing trouble because
its zero energy response has been 9 instead of 8.  This causes the Cal Examine
EM Trig Tower plot to always have a spike at +20,23 and this compresses the
vertical scale.  By hand I edited  Init_DAC_Bytes.LSM  to move the DAC value
for this channel from 25 to 23.  The hand done histogram was  25-->9, 24-->8,
23-->8, 22-->7, 21-->7. Copied the new Init_DAC_Bytes.LSM to MSU::TrgCur.DZero.
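
The hand-made DAC scan above can be searched for the settings that give the
wanted zero energy response of 8.  A sketch only, using the five points from
this entry; the log does not say why 23 was preferred over 24, so the code
just lists the candidates.

```python
# DAC value -> zero energy response, from the hand-made scan of +20,23 EM.
response_by_dac = {25: 9, 24: 8, 23: 8, 22: 7, 21: 7}
target_response = 8

candidates = sorted(
    dac for dac, resp in response_by_dac.items() if resp == target_response
)
print(candidates)
```

Both 23 and 24 read back as 8; the value written into Init_DAC_Bytes.LSM was
23.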

Repaired PDM-14 which failed 2-OCT-1994.  Removed the failed -4.5V brick MSU
SN#95 and replaced it with MSU SN#40.  Tested all 4 supplies with the toaster.
A good place to find 208V 3 phase is outside of the main machine shop on the
high bay floor.  This outlet should be far away from any power distribution
panels that support the running experiment.  PDM-14 is now stored at D-Zero
Hall as a tested spare.
..............................................................................

Date:  2-OCT-1994   At: MSU-Fermi  Topics: M107 Upper Tier 1 Power Pan Fails

After about 7 hours of Tevatron beam (the first in over 1 month) the M107
Upper Tier 1 Power Pan  -4.5 Volt brick fails.  This is PDM-14 and the -4.5
Volt brick is SN#95 which has only been operating since June 1994.  The
output from this brick fell to about only -3 Volts.  Power cycling restored
the supply to normal operation for about 1/2 hour then it died again.  They
paged Dan Owen and he came in on his way back from a Cub Scout camp out.

Dan Owen at Fermi managed/did the replacement without the elevator.  Things
started up again OK once this supply was replaced.
..............................................................................

Date: 23-SEP-1994 from: MSU    Topics: Remote diagnostics of CalTrig CHTCR

    - temporarily load TRICS V6.3 to test new additions made to random test to
      check the large tiles up to the andor network.

    - This version also has additional "progress report" messages in CHTCR and
      CTFE lookup PROM tests to show which PROM is being checked.
      The CTFE PROM test spends 56.15 s/page for 64 towers (but the remote
      console was on [at MSU!], and this may slow things down) which is about
      20% slower (was 23.7 s/page for 32 towers). This test should be redone
      without the remote console to decide if this progress report is worth the
      slow down.

    - The initial values in random test are hard to understand (cf. below).
      This is after 0 loops on all [1..20] towers to initialize it all, and
      starting on a test for eta [1..16]. The 0 loops on [1..20] initialized all
      towers with 0 counts of ADC, and 0 counts of Threshold.

      If one ignores the EM REF_3 entries, the rest is explainable.

      The Tot Et Reference Sets see the output of the EM+HD PROMs.
      On page 8, we see that the random test was expecting 2*16*32=1024 towers
      and finds 2*20*32=1280 towers, i.e. missing 256 = 2*4 eta rings.
      Note that the pattern is symmetric around the "center page" #4
      The difference is         704  - 544 = 160   5 eta rings  for page #1 & 7
                                1152 - 896 = 256   8 eta rings            2 & 6
                                1120 - 864 = 256   8 eta rings            3 & 5
                                1152 - 896 = 256   8 eta rings            4
      For pages 2 - 6, we are just missing the contribution of eta 17..20.
      For pages 1 & 7 it seems that "some but not all" of the missing PROMs have
      a response of 0 for an input of 0. This is something to expect, as it was
      the method to minimize the zero energy response (no "negative saturation"
      for the steepest prom page slope). Page 1 or page 7 has the greater PROM
      slope depending on the sign of eta. We should thus expect half of these
      PROMs to have an output=0 when input=0. The 5 eta rings, instead of 4,
      comes from the EM eta 20 PROMS which have a constant output of 8, and the
      EM+HD sum can thus not be 0. So for page 1 or page 7, the towers at + or -
      17..19 are supposed to be 0 for an ADC count of 0.
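
The eta ring arithmetic above can be checked mechanically.  This sketch
assumes 32 towers (phi) per eta ring, as read off the 2*16*32 expression,
and uses the found/expected counts from the page table; the names are
illustrative.

```python
TOWERS_PER_ETA_RING = 32  # phi towers in one eta ring (from 2*16*32)


def eta_rings(found: int, expected: int) -> int:
    """Convert a tower count difference into a whole number of eta rings."""
    rings, remainder = divmod(found - expected, TOWERS_PER_ETA_RING)
    if remainder:
        raise ValueError("difference is not a whole number of eta rings")
    return rings


# Page(s) -> (count found, count expected by the random test)
page_counts = {
    "1 & 7": (704, 544),
    "2 & 6": (1152, 896),
    "3 & 5": (1120, 864),
    "4": (1152, 896),
}
rings_by_page = {page: eta_rings(*pair) for page, pair in page_counts.items()}
print(rings_by_page)
```

Pages 2 through 6 come out at 8 rings (both signs of eta 17..20) and pages
1 & 7 at 5 rings, matching the table.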

E-HRD/TST%rand% Global TOT Twr Count REF_0 Page#1 is 704 instead of 544                             %% time: 23-SEP-1994 12:59:10.21
E-HRD/TST%rand% Global TOT Twr Count REF_1 Page#1 is 704 instead of 544                             %% time: 23-SEP-1994 13:00:27.87
E-HRD/TST%rand% Global TOT Twr Count REF_2 Page#1 is 704 instead of 544                             %% time: 23-SEP-1994 13:00:27.97
E-HRD/TST%rand% Global TOT Twr Count REF_3 Page#1 is 704 instead of 544                             %% time: 23-SEP-1994 13:00:28.07
E-HRD/TST%rand% Global TOT Twr Count REF_0 Page#2 is 1152 instead of 896                            %% time: 23-SEP-1994 13:00:28.95
E-HRD/TST%rand% Global TOT Twr Count REF_1 Page#2 is 1152 instead of 896                            %% time: 23-SEP-1994 13:00:29.04
E-HRD/TST%rand% Global TOT Twr Count REF_2 Page#2 is 1152 instead of 896                            %% time: 23-SEP-1994 13:00:29.14
E-HRD/TST%rand% Global EM Twr Count REF_3 Page#2 is 33 instead of 32                                %% time: 23-SEP-1994 13:00:29.23
E-HRD/TST%rand% Global TOT Twr Count REF_3 Page#2 is 1152 instead of 896                            %% time: 23-SEP-1994 13:00:29.33
E-HRD/TST%rand% Global TOT Twr Count REF_0 Page#3 is 1120 instead of 864                            %% time: 23-SEP-1994 13:00:30.17
E-HRD/TST%rand% Global TOT Twr Count REF_1 Page#3 is 1120 instead of 864                            %% time: 23-SEP-1994 13:00:30.27
E-HRD/TST%rand% Global TOT Twr Count REF_2 Page#3 is 1120 instead of 864                            %% time: 23-SEP-1994 13:00:30.37
E-HRD/TST%rand% Global EM Twr Count REF_3 Page#3 is 33 instead of 32                                %% time: 23-SEP-1994 13:00:30.46
E-HRD/TST%rand% Global TOT Twr Count REF_3 Page#3 is 1120 instead of 864                            %% time: 23-SEP-1994 13:00:30.56
E-HRD/TST%rand% Global TOT Twr Count REF_0 Page#4 is 1152 instead of 896                            %% time: 23-SEP-1994 13:00:31.40
E-HRD/TST%rand% Global TOT Twr Count REF_1 Page#4 is 1152 instead of 896                            %% time: 23-SEP-1994 13:00:31.49
E-HRD/TST%rand% Global TOT Twr Count REF_2 Page#4 is 1152 instead of 896                            %% time: 23-SEP-1994 13:00:31.59
E-HRD/TST%rand% Global EM Twr Count REF_3 Page#4 is 1 instead of 0                                  %% time: 23-SEP-1994 13:00:31.68
E-HRD/TST%rand% Global TOT Twr Count REF_3 Page#4 is 1152 instead of 896                            %% time: 23-SEP-1994 13:00:31.78
E-HRD/TST%rand% Global TOT Twr Count REF_0 Page#5 is 1120 instead of 864                            %% time: 23-SEP-1994 13:00:32.62
E-HRD/TST%rand% Global TOT Twr Count REF_1 Page#5 is 1120 instead of 864                            %% time: 23-SEP-1994 13:00:32.72
E-HRD/TST%rand% Global TOT Twr Count REF_2 Page#5 is 1120 instead of 864                            %% time: 23-SEP-1994 13:00:32.81
E-HRD/TST%rand% Global EM Twr Count REF_3 Page#5 is 33 instead of 32                                %% time: 23-SEP-1994 13:00:32.91
E-HRD/TST%rand% Global TOT Twr Count REF_3 Page#5 is 1120 instead of 864                            %% time: 23-SEP-1994 13:00:33.01
E-HRD/TST%rand% Global TOT Twr Count REF_0 Page#6 is 1152 instead of 896                            %% time: 23-SEP-1994 13:00:33.88
E-HRD/TST%rand% Global TOT Twr Count REF_1 Page#6 is 1152 instead of 896                            %% time: 23-SEP-1994 13:00:33.98
E-HRD/TST%rand% Global TOT Twr Count REF_2 Page#6 is 1152 instead of 896                            %% time: 23-SEP-1994 13:00:34.08
E-HRD/TST%rand% Global EM Twr Count REF_3 Page#6 is 33 instead of 32                                %% time: 23-SEP-1994 13:00:34.17
E-HRD/TST%rand% Global TOT Twr Count REF_3 Page#6 is 1152 instead of 896                            %% time: 23-SEP-1994 13:00:34.27
E-HRD/TST%rand% Global TOT Twr Count REF_0 Page#7 is 704 instead of 544                             %% time: 23-SEP-1994 13:00:35.03
E-HRD/TST%rand% Global TOT Twr Count REF_1 Page#7 is 704 instead of 544                             %% time: 23-SEP-1994 13:00:35.13
E-HRD/TST%rand% Global TOT Twr Count REF_2 Page#7 is 704 instead of 544                             %% time: 23-SEP-1994 13:00:35.22
E-HRD/TST%rand% Global EM Twr Count REF_3 Page#7 is 33 instead of 32                                %% time: 23-SEP-1994 13:00:35.32
E-HRD/TST%rand% Global TOT Twr Count REF_3 Page#7 is 704 instead of 544                             %% time: 23-SEP-1994 13:00:35.41
E-HRD/TST%rand% Global TOT Twr Count REF_0 Page#8 is 1280 instead of 1024                           %% time: 23-SEP-1994 13:00:36.95
E-HRD/TST%rand% Global TOT Twr Count REF_1 Page#8 is 1280 instead of 1024                           %% time: 23-SEP-1994 13:00:37.05
E-HRD/TST%rand% Global TOT Twr Count REF_2 Page#8 is 1280 instead of 1024                           %% time: 23-SEP-1994 13:00:37.15
E-HRD/TST%rand% Global EM Twr Count REF_3 Page#8 is 1 instead of 0                                  %% time: 23-SEP-1994 13:00:37.27
E-HRD/TST%rand% Global TOT Twr Count REF_3 Page#8 is 1280 instead of 1024                           %% time: 23-SEP-1994 13:00:37.36


    To explain the initial values for Tot Et Tower Counts, look at this summary
of zero input response of the PROMs

  page#        1    2    3    4    5    6    7    8
PROM
  EMP0101     02   01   00   00   00   01   02   00
  EMP0201     03   02   01   00   00   00   01   00
  EMP0301     04   03   02   01   00   00   00   00
  EMP0401     05   04   03   02   01   00   00   00
  EMP0501     06   05   04   03   02   01   00   00
  EMP0601     06   06   05   04   03   02   00   00
  EMP0701     06   05   05   04   03   02   00   00
  EMP0801     06   05   05   04   03   02   00   00
  EMP0901     07   07   06   05   04   03   00   00
  EMP1001     07   07   06   05   04   03   00   00
  EMP1101     07   07   06   05   04   02   00   00
  EMP1201     08   08   07   06   05   03   00   00
  EMP1301     08   08   07   06   05   03   00   00
  EMP1401     08   08   07   06   05   03   00   00
  EMP1501     08   08   07   06   05   03   00   00
  EMP1601     08   08   07   06   05   03   00   00
  EMP1701     08   08   07   06   05   03   00   00
  EMP1801     08   08   07   06   05   03   00   00
  EMP1901     08   08   07   06   05   03   00   00
  EMP2001     08   08   08   08   08   08   08   00

      EM       0    0    1    2    3    3   17   20  zeroes
  Pos+Neg     17    3    4    4    4    3   17   40

  HDP0101     02   01   00   00   00   00   01   08
  HDP0201     02   01   00   00   00   00   00   08
  HDP0301     03   02   02   01   00   00   00   09
  HDP0401     04   03   03   02   01   01   00   0A
  HDP0501     05   04   04   03   02   01   00   0B
  HDP0601     04   03   03   02   01   01   00   0A
  HDP0701     06   05   05   04   03   02   01   0C
  HDP0801     06   05   05   04   03   02   01   0C
  HDP0901     07   06   06   05   04   03   01   0D
  HDP1001     07   06   06   05   04   03   01   0D
  HDP1101     07   06   06   05   04   03   01   0D
  HDP1201     06   05   05   04   03   02   00   0C
  HDP1301     06   05   05   04   03   02   00   0C
  HDP1401     08   07   07   06   05   04   01   0E
  HDP1501     08   07   07   06   05   04   01   0E
  HDP1601     08   07   07   06   05   04   01   0E
  HDP1701     08   07   07   06   05   04   01   0E
  HDP1801     08   07   07   06   05   04   01   0E
  HDP1901     08   07   07   06   05   04   01   0E
  HDP2001     08   07   07   06   05   04   01   0E

  (EM+HD)/2    0    0    2    2    3    4   18    0  zeroes
  Pos+Neg     18    4    5    4    5    4   18    0  zeroes
              22   36   35   36   35   36   22   40  non-zeroes
   *32 phis  704 1152 1120 1152 1120 1152  704 1280
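
    The summary rows of this table can be recomputed with a short Python
sketch. Two things are assumptions made here to reproduce the rows, not
statements about the hardware: that a tower counts as zero when the truncated
integer (EM+HD)/2 is 0, and that the other sign of eta on page p sees the
column of page 8-p (page 8 pairs with itself). The PROM values are copied from
the table; all names are illustrative.

```python
PHIS = 32  # towers per eta ring, for one sign of eta

def rows(text):
    """Parse rows of 8 hex bytes (pages 1..8) into lists of ints."""
    return [[int(v, 16) for v in line.split()] for line in text.strip().splitlines()]

# Zero-input responses, eta 1..20 from top to bottom (EMPxx01 rows).
EM = rows("""
02 01 00 00 00 01 02 00
03 02 01 00 00 00 01 00
04 03 02 01 00 00 00 00
05 04 03 02 01 00 00 00
06 05 04 03 02 01 00 00
06 06 05 04 03 02 00 00
06 05 05 04 03 02 00 00
06 05 05 04 03 02 00 00
07 07 06 05 04 03 00 00
07 07 06 05 04 03 00 00
07 07 06 05 04 02 00 00
08 08 07 06 05 03 00 00
08 08 07 06 05 03 00 00
08 08 07 06 05 03 00 00
08 08 07 06 05 03 00 00
08 08 07 06 05 03 00 00
08 08 07 06 05 03 00 00
08 08 07 06 05 03 00 00
08 08 07 06 05 03 00 00
08 08 08 08 08 08 08 00
""")

# Zero-input responses, eta 1..20 from top to bottom (HDPxx01 rows).
HD = rows("""
02 01 00 00 00 00 01 08
02 01 00 00 00 00 00 08
03 02 02 01 00 00 00 09
04 03 03 02 01 01 00 0A
05 04 04 03 02 01 00 0B
04 03 03 02 01 01 00 0A
06 05 05 04 03 02 01 0C
06 05 05 04 03 02 01 0C
07 06 06 05 04 03 01 0D
07 06 06 05 04 03 01 0D
07 06 06 05 04 03 01 0D
06 05 05 04 03 02 00 0C
06 05 05 04 03 02 00 0C
08 07 07 06 05 04 01 0E
08 07 07 06 05 04 01 0E
08 07 07 06 05 04 01 0E
08 07 07 06 05 04 01 0E
08 07 07 06 05 04 01 0E
08 07 07 06 05 04 01 0E
08 07 07 06 05 04 01 0E
""")

def tot_zeroes(page):
    """Eta rings whose truncated (EM+HD)/2 is zero on a page (1..8)."""
    col = page - 1
    return sum(1 for em, hd in zip(EM, HD) if (em[col] + hd[col]) // 2 == 0)

def mirror(page):
    """Page assumed to apply to the other sign of eta (1<->7, 2<->6, 3<->5)."""
    return page if page == 8 else 8 - page

em_zeroes = [sum(1 for em in EM if em[p] == 0) for p in range(8)]
tot_counts = [
    (2 * len(EM) - tot_zeroes(p) - tot_zeroes(mirror(p))) * PHIS
    for p in range(1, 9)
]
print(em_zeroes)
print(tot_counts)
```

    Under these two assumptions the sketch gives back the EM "zeroes" row and
the "*32 phis" row, i.e. the initial Tot Et tower counts seen in the log.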


    How to explain the initial counts on the EM Tower counts:
For getting an EM tower above threshold, we must have EM > 0 and HD = 0
This never happens on page 4. For page 7 this happens at eta = +2, page 6 at +1,
page 3 at +2, and by extrapolation page 1 at -2, page 2 at -1, page 5 at -2.

    There is a systematic offset of 1 count in the EM ref set #3, for all pages
except page #1. One can assume that it comes from -17;9 as seen on 22-sep. Page
#1 is the only page where the EM zero input response is 0, the HD response
is always > 0.

    Possible explanation: This behaves as if the HD veto was not kicking in (on
any page) and the tower passes its threshold as soon as EM = 0, with no regard
to HD >0.

    - first attempts at running new test weren't very successful because of a
      bug that would prevent reporting errors for large tile andor terms.
      144,000 loops were performed on eta 1..16 with no other error.

    - after fixing code bug, run for 839902 loops. 27 large tile errors, no
      other error detected. (log file is TRICS_23SEP94.LOG;3)

28581/1000000,
Large Tile Count is 2 but REF_7 LT Andor Terms .GE. 1,2,3 are 1,0,0
Pick was 11,HD,POS,E_4,P_11,LUP_3-8-8-8,EMET_REF,REF_2,42,CMP_2
53283/1000000,
Large Tile Count is 2 but REF_7 LT Andor Terms .GE. 1,2,3 are 1,0,0
Pick was 37,HD,NEG,E_9,P_12,LUP_3-4-6-8,EMET_REF,REF_3,143,CMP_1
110355/1000000,
Large Tile Count is 2 but REF_7 LT Andor Terms .GE. 1,2,3 are 1,0,0
Pick was 189,HD,NEG,E_4,P_29,LUP_7-7-8-8,EMET_REF,REF_0,39,CMP_1
133716/1000000,
Large Tile Count is 2 but REF_7 LT Andor Terms .GE. 1,2,3 are 1,0,0
Pick was 10,EM,NEG,E_1,P_21,LUP_5-1-8-8,TOTET_REF,REF_0,82,CMP_2
166750/1000000,
Large Tile Count is 3 but REF_2 LT Andor Terms .GE. 1,2,3 are 0,1,1
Pick was 228,EM,POS,E_7,P_30,LUP_8-6-4-8,EMET_REF,REF_1,42,CMP_0
187046/1000000,
Large Tile Count is 5 but REF_4 LT Andor Terms .GE. 1,2,3 are 1,1,0
Pick was 46,HD,NEG,E_13,P_31,LUP_4-2-6-8,EMET_REF,REF_0,6,CMP_2
230247
Large Tile Count is 2 but REF_7 LT Andor Terms .GE. 1,2,3 are 1,0,0
Pick was 185,EM,NEG,E_2,P_6,LUP_5-3-4-8,TOTET_REF,REF_1,119,CMP_2
248206
Large Tile Count is 2 but REF_7 LT Andor Terms .GE. 1,2,3 are 1,0,0
Pick was 241,EM,NEG,E_3,P_18,LUP_3-5-1-8,TOTET_REF,REF_1,244,CMP_1
292822/1000000,
Large Tile Count is 2 but REF_7 LT Andor Terms .GE. 1,2,3 are 1,0,0
Pick was 100,EM,POS,E_6,P_18,LUP_6-8-3-8,EMET_REF,REF_2,39,CMP_0
345197/1000000,
Large Tile Count is 2 but REF_7 LT Andor Terms .GE. 1,2,3 are 1,0,0
Pick was 49,HD,POS,E_3,P_12,LUP_8-8-3-8,EMET_REF,REF_3,176,CMP_2
375297/1000000,
Large Tile Count is 2 but REF_7 LT Andor Terms .GE. 1,2,3 are 1,0,0
Pick was 122,HD,NEG,E_3,P_14,LUP_8-5-3-8,EMET_REF,REF_1,220,CMP_0
375533/1000000,
Large Tile Count is 3 but REF_2 LT Andor Terms .GE. 1,2,3 are 0,1,1
Pick was 13,EM,NEG,E_16,P_20,LUP_7-6-8-8,HDET_VETO,REF_2,28,CMP_3
422227/1000000,
Large Tile Count is 3 but REF_7 LT Andor Terms .GE. 1,2,3 are 1,0,1
Pick was 19,HD,NEG,E_11,P_27,LUP_4-8-4-8,EMET_REF,REF_1,148,CMP_0
519607/1000000,
Large Tile Count is 4 but REF_4 LT Andor Terms .GE. 1,2,3 are 1,1,0
Pick was 82,HD,NEG,E_14,P_26,LUP_8-2-1-8,EMET_REF,REF_0,168,CMP_1
536559/1000000,
Large Tile Count is 2 but REF_7 LT Andor Terms .GE. 1,2,3 are 1,0,0
Pick was 57,HD,NEG,E_13,P_31,LUP_3-1-5-8,TOTET_REF,REF_2,125,CMP_2
538971/1000000,
Large Tile Count is 2 but REF_7 LT Andor Terms .GE. 1,2,3 are 1,0,0
Pick was 120,HD,POS,E_14,P_5,LUP_1-1-3-8,HDET_VETO,REF_2,147,CMP_1
584306/1000000,
Large Tile Count is 4 but REF_4 LT Andor Terms .GE. 1,2,3 are 1,1,0
Pick was 188,HD,POS,E_11,P_2,LUP_6-4-3-8,HDET_VETO,REF_3,186,CMP_2
599448/1000000,
Large Tile Count is 4 but REF_2 LT Andor Terms .GE. 1,2,3 are 0,1,1
Pick was 106,EM,POS,E_12,P_27,LUP_1-8-2-8,TOTET_REF,REF_0,167,CMP_1
644079/1000000,
Large Tile Count is 2 but REF_7 LT Andor Terms .GE. 1,2,3 are 1,0,0
Pick was 118,HD,NEG,E_6,P_25,LUP_1-8-7-8,EMET_REF,REF_3,235,CMP_2
647084/1000000,
Large Tile Count is 2 but REF_7 LT Andor Terms .GE. 1,2,3 are 1,0,0
Pick was 245,HD,NEG,E_12,P_15,LUP_6-6-5-8,TOTET_REF,REF_0,65,CMP_0
722884/1000000,
Large Tile Count is 3 but REF_2 LT Andor Terms .GE. 1,2,3 are 0,1,1
Pick was 143,EM,NEG,E_16,P_21,LUP_2-4-4-8,TOTET_REF,REF_3,38,CMP_3
734397/1000000,
Large Tile Count is 2 but REF_7 LT Andor Terms .GE. 1,2,3 are 1,0,0
Pick was 132,EM,NEG,E_2,P_24,LUP_4-7-7-8,EMET_REF,REF_1,201,CMP_0
742611/1000000,
Large Tile Count is 2 but REF_7 LT Andor Terms .GE. 1,2,3 are 1,0,0
Pick was 239,EM,NEG,E_12,P_10,LUP_3-5-3-8,TOTET_REF,REF_3,99,CMP_2
757132/1000000,
Large Tile Count is 3 but REF_7 LT Andor Terms .GE. 1,2,3 are 1,0,1
Pick was 61,HD,NEG,E_16,P_23,LUP_8-2-3-8,TOTET_REF,REF_1,218,CMP_0
780123/1000000,
Large Tile Count is 3 but REF_7 LT Andor Terms .GE. 1,2,3 are 1,0,1
Pick was 167,HD,POS,E_12,P_18,LUP_7-6-1-8,TOTET_REF,REF_2,165,CMP_0
798460/1000000,
Large Tile Count is 2 but REF_7 LT Andor Terms .GE. 1,2,3 are 1,0,0
Pick was 236,HD,POS,E_12,P_30,LUP_8-6-6-8,TOTET_REF,REF_0,229,CMP_0
839902/1000000,
Large Tile Count is 2 but REF_7 LT Andor Terms .GE. 1,2,3 are 1,0,0
Pick was 210,HD,NEG,E_16,P_2,LUP_5-7-5-8,EMET_REF,REF_3,163,CMP_1

    In all cases the error cannot be reproduced by repeating the same loop. We
have an intermittent problem with the large tiles, or with the test.
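
    To make the mismatch in these messages explicit, here is a minimal Python
sketch. It assumes the three Andor terms are plain "count >= 1, 2, 3"
comparisons, as the ".GE. 1,2,3" wording suggests; the function name and the
list of samples (one of each distinct pattern from the log above) are only
illustrative.

```python
def expected_andor_terms(large_tile_count, thresholds=(1, 2, 3)):
    """Term bits expected for a count: 1 where count >= threshold, else 0."""
    return tuple(int(large_tile_count >= t) for t in thresholds)

# (Large Tile Count, reported Andor terms) pairs seen in the error messages
reported = [
    (2, (1, 0, 0)),
    (3, (0, 1, 1)),
    (3, (1, 0, 1)),
    (4, (1, 1, 0)),
    (4, (0, 1, 1)),
    (5, (1, 1, 0)),
]

bad_terms = []
for count, terms in reported:
    want = expected_andor_terms(count)
    # thresholds whose reported bit disagrees with the expected bit
    wrong = [t for t, got, exp in zip((1, 2, 3), terms, want) if got != exp]
    bad_terms.append(wrong)
    print(count, terms, "expected", want, "disagreeing .GE. terms:", wrong)
```

    With this reading, each sampled message disagrees with the count on exactly
one term, and that term reads 0 where a 1 would be expected.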

    Made the following TRICS command files (backup on MSU::[TRG_CURRENT.DZERO]
only, since these files are only temporarily useful)

  READ_LARGE_TILE.TCC             read LgTile pattern on tier #2 LTCC cards
  READ_LARGE_TILE_ANDOR.TCC       read LgTile Andor terms on FW IMLRO
  CYCLE_TSS_LGT_T3.TCC            cycle the TSS to latch the Tier #3 LTCC card
  CYCLE_TSS_LGT_T2.TCC            cycle the TSS to latch the Tier #2 LTCC card
  CYCLE_LGT_BY_HAND.TCC           recursive call to previous files

    The LgTile Pattern shows nothing peculiar and all reference sets agree.
    Reading the Andor Terms did not show the problem, i.e. the andor terms were
    now fine, even without single cycling the caltrig again. At this time, the
    CT MTG was "parked" and the FW MTG was free running.

    Made another version of the system which makes the FW MTG also single cycle.
    (log file is TRICS_23SEP94.LOG;5).
    I only had time to catch 1 error before returning the system to Jan et al.
        39358/100000,
        Large Tile Count is 3 but REF_7 LT Andor Terms .GE. 1,2,3 are 1,0,1
        Pick was 164,EM,POS,E_7,P_10,LUP_8-7-7-8,HDET_VETO,REF_2,158,CMP_2
    and the symptoms were identical:
        no problem seen on the LgTile pattern
        no problem seen on the andor cards
        problem did not repeat upon redoing same loop
    Was the code really single cycling the FW MTG? I think so, but I didn't
    have time to really prove it to myself.
    Is this a problem of flaky reading of the IMLRO?

    - at the end of all this, the older system V6.2 was reloaded,
      as the V6.3 diagnostics code is not yet stable/useful.
..............................................................................

Date: 21-22-SEP-1994 from: MSU    Topics: Remote diagnostics of CalTrig CHTCR:
                                          all CHTCRs check out,
                                          2 * 100 k loops of random test ok.

    - Philippe runs CHTCR test on full coverage eta +/- 1..20. No errors.
      Method: Use INITIAL TRGTWR before testing each card
        1) use initialize Trigger Tower to clean up the state left by the test.
        2) select the first CHTCR for PROM test, run the test
        3) go back to (1) for next CHTCR

      Initializing the towers with 0 loops of random test (as described in entry
      from 22-DEC-1993 in D0_HALL_LOGBOOK.LBK_1993) didn't work:
        - tried testing the CHTCR at either + or - 1..4;1..8
        - everything is fine thru EM Ref Set # 0, 1, 2
        - reaching EM Ref Set #3, the initial global count is 1 instead of 0.
          Using Tree Browser, I found the culprit at EM tower -17;9.
          The 0 loops of random test left all thresholds (em, hd_veto and tot)
          programmed with zero, and all towers are in simu mode, with simu ADC=0
          The answer on all EM Ref Sets should be 0 because
                for eta 1..2, the zero input response of the EM PROM is 0
                for eta 3..20, the zero input response of the HD PROM is > 0.
          I tried releasing the CT MTG, and re-aiming the read/write pipes, then
          changing the threshold to see if I could shut it off at all. I didn't
          succeed in seeing any change in this tower. But I couldn't change its
          neighbors either. I don't know what I was doing wrong. Correction:
          I didn't take into account the HD veto. This needs to be investigated.
          Note that TRGMON shows the EM#3 count at 0 after initialize.
        - All Tot Et Ref Sets initial counts were 1136 instead of 0.
          I believe 1136 can be derived from the 1152 initial count (cf entry
          for 23-sep) by removing the contribution of the 4*8 towers in the
          CHTCR cell being tested. But 16 of the 32 towers (eta = 1,2) in this
          range were already not contributing, so only 16 towers drop out:
          1152 - 16 = 1136.
          This is systematic across all 4 tot et refsets, and is a new(?)
          effect since we changed the HD PROMs for L1.5 CT EM lookup. This
          change made the CHTCR test use page #4 where it was page 8 before
          (since HD page 8 is now constant >0)


    - Run 2 x 100k loops of random test. No errors.
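
      The 1136 initial Tot Et count noted above can be worked through as a
      small Python sketch. The numbers are from this entry and the 23-sep
      entry; the constant names are only illustrative, and the cell is taken
      to be the eta 1..4 by 8 phi CHTCR cell under test.

```python
PAGE4_INITIAL_TOT_COUNT = 1152  # page #4 initial count, from the 23-sep entry
CELL_ETAS = range(1, 5)         # CHTCR cell under test covers eta 1..4
CELL_PHIS = 8                   # and 8 phis
SILENT_ETAS_PAGE4 = {1, 2}      # eta rings already not contributing on page 4

towers_in_cell = len(CELL_ETAS) * CELL_PHIS
already_silent = sum(CELL_PHIS for eta in CELL_ETAS if eta in SILENT_ETAS_PAGE4)

# Only the towers that were contributing before the test can drop out.
removed = towers_in_cell - already_silent
expected_initial = PAGE4_INITIAL_TOT_COUNT - removed
print(towers_in_cell, already_silent, expected_initial)
```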
..............................................................................

Date: 16-SEPT-1994  At: Fermi  Topics: Deliver spare CRC, 68k_Services ABS
                                       files, Tests of L1 Cal Trig.

Deliver the 3rd L15CT CRC card to D0 Hall to be kept here as a spare.  It is
CRC serial number SN#1.

The 68k_Services files available in TrgCur: are   L15CT_68K_Services.Abs
and  L15CT_68K_Services.Abs_Panic.   I left the "Panic" mode code loaded.
These are the most recent versions, i.e. 4 Terms and not confused by $ff's
from TCC.

Testing of the L1 Cal Trig
--------------------------
Started by running Caltrg_Random in the eta range 1:16.  I tried to make some
runs of 50k loops.  In the beginning I had the "normal" problem of Global EM
Tower Count Ref Set 0 dancing around +-2 of the proper value in the tower
count range of 140 or 150.  Then things settled down and it was running 50k
loops OK.  This is eta 1:16, all Pages, all Ref Sets.  I also tried some 50k
loop runs with just Page 4.  Next I tried 1,000,000 loops all Pages all Ref
Sets.  This made it to:

E-HTT/PAR%rand% Loop 896000/1000000, Error Count is 0

At which point it was stuck in some funny repeating +-4 problem of:

Global HD 1st Energy Sum is 66155 instead of 66159, T1 trunc =65536
Global HD 1st Energy Sum is 66380 instead of 66376, T1 trunc =65536
Global HD 1st Energy Sum is 62318 instead of 62314, T1 trunc =69632
Global HD 2nd Energy Sum is 62436 instead of 62432, T1 trunc =69632
Global HD 2nd Energy Sum is 66139 instead of 66135, T1 trunc =65536
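
The size of the offset in the messages above can be pulled out with a few
lines of Python. This is only a sketch for reading the log: the regular
expression is written against the message format as it appears here and is
not part of TRICS.

```python
import re

messages = [
    "Global HD 1st Energy Sum is 66155 instead of 66159, T1 trunc =65536",
    "Global HD 1st Energy Sum is 66380 instead of 66376, T1 trunc =65536",
    "Global HD 1st Energy Sum is 62318 instead of 62314, T1 trunc =69632",
    "Global HD 2nd Energy Sum is 62436 instead of 62432, T1 trunc =69632",
    "Global HD 2nd Energy Sum is 66139 instead of 66135, T1 trunc =65536",
]

pattern = re.compile(r"is (\d+) instead of (\d+)")

# found minus expected, one value per message
offsets = [
    int(m.group(1)) - int(m.group(2))
    for m in (pattern.search(line) for line in messages)
]
print(offsets)
```

Every message is off by exactly 4 counts, low in the first message and high
in the others.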

After starting back up, everything looked fine and it made it through 31k loops
of a 50k loop run with no errors at which point the system was Initialized out
from under it.

Next I started the memory test scan of eta 1:16 all Pages.
Test type is Lookup, both signs of eta 1:16, all phi's, Both channel types,
All Pages, 1 loop, normal I/O.  This started at 20:52 and ran until 22:43
with no errors shown.

Update 28-SEP-1994 : Philippe looks at log files
    Notice that the EM Ref Set 0 +/- 2 problem is all happening when the tower
is picked in the +13..16;1..8 range, except for the first occurrence.
This granularity maps onto a Tier #1 CHTCR or a Tier #2 CAT2 input.

    HD,NEG,E_11,P_9
    HD,POS,E_16,P_7
    HD,POS,E_14,P_6
    HD,POS,E_14,P_6
    HD,POS,E_16,P_7
    EM,POS,E_16,P_2
    EM,POS,E_15,P_8
    HD,POS,E_13,P_6
    HD,POS,E_16,P_3
    HD,POS,E_16,P_6
    HD,POS,E_16,P_3
    EM,POS,E_14,P_8
    HD,POS,E_13,P_1
    HD,POS,E_13,P_6
    HD,POS,E_16,P_6
    HD,POS,E_14,P_6

    The HD Problem seems to be located on a CTFE card:
HD Tier 1 CAT2 operand for NEG,E_1_4,P_1 Page#2 is 569 instead of 573
The problem is either intermittent (some days yes, some days not) or needs a
very unlikely combination of trigger tower energies (it took 900kloops)
It is also flaky once the system gets in screwed up mode behind by 4 (it may
give the correct answer, appearing as ahead by 4, but redoing same loop goes
back to bad, i.e. no error)
..............................................................................

Date: 7,8,9-SEPT-1994  At: Fermi  Topics:Group Meeting, Install the last 13
                                  L15 FW Term Receiver PAL's,  Swap the IML
                                  cards for L1 FW Input Terms 128:255, Install
                                  new version of TRICS, Install new version of
                                  L15CT 68k_Ser, Tests of L15CT,  Recorded
                                  cosmic run with L15CT filtering 4 L1 SpTrgs.

Cooked and installed the last 13 L15 FW Term Receiver PAL's.  I cooked two
spare parts and put them in the programmed parts container. The UniSite model
48 at D-Zero is dead again. I let Sten Hansen know and used a cooker in the
9th floor WH.

While power was off I swapped the last 2 IML's for non "H" IML's.  The first
two were finally swapped a couple of months ago.  This time I swapped the ones
for L1 Terms 128:255.  In M101 pull SN#11 install SN#13.  In M102 pull SN#14
and install SN#16.

Start running TRICS Version 6.2  7SEP94.  Also start running the newest 68k_Ser
code that is protected from having its Software Flags overwritten when TRICS
loads the parameter block into the 68k's "dual port" memory area.

L15CT Tests
-----------
Tested L15CT (using the simple CAL15CT trigger configuration file) on all 4
of its L15 FW Terms (i.e. #16:#19).  Did this by editing the VWork1: temporary
command files  L15CT_ALL.COM  and  L15CT_CMD.COM  and by moving the L15 FW
Term with TRICS.

Verified that L15CT worked with parameter values of "0" and "2" assigned to the
Global DSP.

Made a recorded test run of L15CT filtering on 4 L1 Spec Trig's.  This is just
a "cosmic + noise" run.  It is Run Number 83580.  The file is
DATA3:[CAL]EMC_083580_01.X_ZRD01;1 271038 blocks.  This is about 1000 events.

I worked with Bill Cobau's  MULTI_CAL15  trigger configuration to get something
that would run without beam  i.e. just with noise and cosmics.

The following is the basic setup as used in this run:

 L1 Spec Trig
  Numb  Name       L1 Conditions              L15CT Conditions
 ------------    -----------------   ----------------------------------------
  7  EM_1_Max    1 EM > 2.5 GeV      1 elec, 1x2 EM Sum > 5.5 GeV  Iso > 0.80
  8  EM_2_Med    2 EM > 0.5 GeV      2 elec, 1x2 EM Sum > 1.0 GeV  Iso > 0.10
 10  EM_1_Miss   1 EM > 2.5 GeV      1 elec, 1x2 EM Sum > 5.5 GeV  Iso > 0.10
                  + MsPt > 7.5 GeV
 12  EM_Jet      1 EM > 2.5 GeV      1 elec, 1x2 EM Sum > 5.5 GeV  Iso > 0.10
                  + 2 JT > 1.0 GeV

The following are the changes that I made on the fly to Bill's MULTI_CAL15
to get things running at a couple of Hz with reasonable L15CT rejection ratios.

1. Execute the Force L0 command file.

2. Change the following L1 Reference Sets:

    EM Ref Set #0   was 12.0 GeV over eta 1:19  set to 2.5 GeV  same eta
    EM Ref Set #1   was  2.5 GeV over eta 1:19  List Builder  no change
    EM Ref Set #2   was  7.0 GeV over eta 1:19  set to 0.5 GeV  same eta
    EM Ref Set #3   was 12.0 GeV over eta 1:13  set to 2.5 GeV  same eta

    Tot Ref Set #0   was  5.0 GeV over eta 1:20  set to 1.0 GeV  same eta
    Tot Ref Set #1   was  3.0 GeV over eta 1:20  List Builder  no change

3. Change the Level 1  Missing Pt Threshold #0 from  15.0 GeV to 7.5 GeV

4. Change all L1 prescales to 1.

5. Change the L15CT Term #0 EM Ref Set from  7.0 GeV  to  1.0 GeV  with the
   same eta 1:19 coverage  i.e. any EM Trig Tower over 1.0 GeV was a L15CT
   candidate.
6. Change the following L15CT Term Parameters:

             Cobau's original Beam Running L15CT setup

     L15 Term     1x2 EM Et       EM vs. Tot      Count           Used on L1
      Number      Threshold       Ratio Thresh    Threshold       Spec Trig's
     --------     ---------       ------------    ---------       -----------
          0        15.0 GeV           0.80            1                 7
          1        10.0 GeV           0.10            2                 8
          2        15.0 GeV           0.10            1             10, 12

            This Run's L15CT Setup for Cosmic and Noise Running

     L15 Term     1x2 EM Et       EM vs. Tot      Count           Used on L1
      Number      Threshold       Ratio Thresh    Threshold       Spec Trig's
     --------     ---------       ------------    ---------       -----------
          0         5.5 GeV           0.80            1                 7
          1         1.0 GeV           0.10            2                 8
          2         5.5 GeV           0.10            1             10, 12

7. The L15CT Mark and Force Pass Ratio was set to 25.

Making these changes resulted in the following typical performance with
just noise and cosmics in the calorimeters:

Global Monitoring of All Allocated Specific Triggers          9-SEP-94 12:42:06
                                             Integr Time DBSC/SBSC: 59.9/59.9 s
 Global Event Transfer Rate:      3.49 Hz  Level 1: Running  Information: Fresh
 Global Level 1 Trigger Rate:     7.51 Hz  Fast Level 0 Good:           0.00 Hz
 Level 1.5 Input/Reject:   6.9 Hz/58.1 %   Dead Beam X During Level 1.5:  0.1 %
 Time Since Last Initialize:  0 03:16:47   Events Transf Since:            8109
                                                      |Tot|Tot |Total|
Sp.|Firing| Andor|Prscl|L 1.5|Events|Globl|F-End|Level|And|Strt|Watch|
Trg|  Rate|  Rate|Ratio|Rejct|Transf|Expos| Busy|2 Dis|Trm|Dgtz|Busy |
---|----Hz|----Hz|-----|----%|------|----%|----%|----%|---|----|-----|-------
 0 |  0.58|286275| 500k|  0  |  6339|  0.0|  0.0|  0.2|  1|   1|    1|
 7 |  6.92|  6.92|    1| 58.1|  4665| 99.8|  0.0|  0.2|  8|   9|    9|L1.5
 8 |  0.65|  0.65|    1| 89.7|   468| 99.8|  0.0|  0.2|  7|   9|    9|L1.5
10 |  3.59|  3.59|    1| 40.0|  2272| 99.8|  0.0|  0.2|  9|   9|    9|L1.5
12 |  0.48|  0.48|    1| 51.7|   331| 99.8|  0.0|  0.2|  9|   9|    9|L1.5
30 |237962|238531|    1|  0  |******| 99.8|  0.0|  0.2|  4|   9|    9|
31 |  0.00|286275|    1|  0  |    67|  0  |  0.0|  0.2|  0|   1|    1|

This clip from TrgMon was taken during this run.  It shows the typical
L15CT rejections.  It is the average of 1 minute of running.  The run
actually lasted for about 5 minutes.
..............................................................................

Date:  24,25-AUG-1994  At: Fermi  Topics: ECB Meetings, New TCC code, New
                                  TRICS_Init_Auxi_L15CT.Dat, Start loading
                                  the DSP BLX files directly from TCC disk,
                                  Test the 4 Term L15CT

Moved to a new version of TRICS that accepts reentrant commands and that sends
us a mail message if it has to say  FAILURE BAD  at Init time.  If there are
any problems the backup path is to  TRICS_V62.SYS_19AUG94;4

We also changed the  TRICS_Init_Auxi_L15CT.Dat  file so that it does not say
anything about any L15CT parameters.  TRICS_Init_Auxi_L15CT.Dat  now depends
on the  L15CT_Default_Config.Dat  file to execute correctly.  Also
TRICS_Init_Auxi_L15CT.Dat  was changed to load the DSP's from TCC's disk.
The backup path for  TRICS_Init_Auxi_L15CT.Dat  file is in [TrgCur.Obsolete].
All of this new stuff appears to be working fine.

Start loading the DSP BLX files directly from TCC disk
------------------------------------------------------
This change was made because we have seen times when the network link to the
host was very slow or even timing out when TCC tried to get the BLX files
from the host.  So now it gets them from its own disk, from directory:
D0HTCC::DUA0:[L15CT$EXEC].

Where TCC will look for the DSP BLX files is controlled by what COOR
puts in its  LOADCODE  command.

What COOR puts in the  LOADCODE  command is controlled by what is in
the file  CTL:Trig_Config.Ctl.  Near the end of this file in the
Cal_L15_crate 0  Code_directory  statement it used to say  L15CT$EXEC:
today I changed this to  FROM_LOCAL_DISK.

Note that we have not erased the BLX files from the host, so we can fall back
to loading from there if necessary.

Test the 4 Term L15CT
---------------------
Between Stores I tested the various Terms of the 4 Term L15CT in the following
way.  I used our standard  Cal_Trig_L15  trigger configuration to setup the
COOR and L2 parts and to load up and get L15CT running on its Term #0  aka
L15 FW Term #16.  All of this went as normal and ran OK.

Then I paused the run and used TRICS to move to L15 FW Term #17 and
L15CT_Cmd.Com and L15CT_All.Com to move to L15CT Term #1.  This also looks OK.

I paused the run again and used TRICS to move to L15 FW Term #18 and
L15CT_Cmd.Com and L15CT_All.Com to move to L15CT Term #2.  This looks OK.

I had trouble when I tried to move to L15 FW Term #19.  TRICS has eyes !!
It looked into rack M103 and saw no L15 FW Term Receiver PAL for terms #19
and above and this caused it to answer BAD PARAM when I tried to setup on
L15CT TERM #3.  Anyway L15CT Terms #0, #1, and #2 look OK and I need to
cook more L15 FW Term Receiver PALs.
..............................................................................

Date:  23-AUG-1994  At: MSU  Topics: Remember where the Trigger programming
                                     can be found

The Postscript versions of Trigger Lists Versions 8.0 and above can
be found in the directory:

    D0$CONFIGS$TRIGLIST

Also in that directory is a "Trigger History" file (in ASCII), a Glossary
of Terms used in the Trigger Lists (also in ASCII), and "Strawman" versions
of Trigger Lists 8.0 (in Postscript).

Additionally, the Trigger Lists which are actually used to define the
running of D0 (i.e. the Trigger Lists read by TRIGPARSE) for Global
Running in Run 1B are in the directory:

    D0$CONFIGS$RUN1B_GLB

The trigparse file and postscript file are in

        FNALD0::D0$L2BETA:[CONFIGS.RUN1B_GLB]
..............................................................................

Date:  22-AUG-1994  At: MSU  Topics: Collect more L1 Spec Trig's overlap data.

Collected Spec Trig's overlap data starting at 13:28 at channel 13 luminosity
of 12.3   Finished collecting at 15:46 at a channel 13 luminosity of 10.4
Collected 1680 events of which 103 were Spec Trig #31 only.  This is in the
file:  VWork1:SpTrig_Fired_List_1328_22AUG94.Txt
..............................................................................

Date:  19-AUG-1994  At: DZero  Topics: Install 4 Term L15CT, look at 3 more
                                 events that fail L2 XYZ with negative energy.
Move to Using 4 Term L15CT
--------------------------
This move involves changes to: TCC, DSP, 68k_Ser, TRICS_Init_Auxi_L15CT.Dat,
and adding a new file called L15CT_Default_Config.Dat

The known remaining problems with the new 4 Term L15CT include: reentrant calls
in the "configuration" files for TCC which cause "failure bad" at Initialize
time.

Because of delays (and sometimes failures) in getting DSP code over the
network if the host system is busy or if the network is busy when TCC is
trying to load up the DSP's; Edmunds remains uncontrollably cranked up to
keep the BLX files on TCC disk.

New features of the 4 Term DSP code include:  programmable count thresholds
for the number of objects required before the Term answers "yes, true".  The
legal range of the count threshold includes zero, so there no longer needs to
be a "PANIC" mode.

What to push next: more testing of the 4 Term L15CT, test down load and then
an in beam test of 4 rationally defined physics terms.

Look at 3 more events that fail L2 XYZ with negative energy
-----------------------------------------------------------
From the evening of 18-AUG-1994 into 19-AUG-1994, for about 12 hours, 3 L2
nodes were set with break points to stop if negative energy in Cal precision
readout was found.  Three events were collected this way.

file   SCRATCH:[LONG.LEVEL2]D0L241_NEG_EM.DMP          1360 blocks
       DATE and TIME= 19-AUG-1994 00:35:28.15
       LOCAL RUN#        82955
       L1 Spec Trig  8  Fired
       The event looks OK to L1. The event looks OK to L2, there is EM3 energy.
       There are 3 EM hits:  -3,8  2.5 GeV    +2,13  9.0 GeV    +3,13  7.5 GeV

file   D0L242_NEG_EM.DMP;1       507 Blocks
       DATE and TIME= 19-AUG-1994 05:14:11.19
       LOCAL RUN#        82956
       L1 Spec Trig's  19  Fired
       The event looks OK to L1.  No EM3 energy in the precision Cal readout.
       There are 2 EM hits    +3,24  2.75 GeV        +11,10   11.25 GeV

file   Scratch:[Long.Level2]  D0L243_NEG_EM_E_BREAK.DMP       451 blocks
       DATE and TIME= 18-AUG-1994 21:15:23.36
       LOCAL RUN#        82946
       L1 Spec Trig's  8, 16, 26  Fired
       wall of fire  eta -7 -8
..............................................................................

Date:  18-AUG-1994  At: DZero  Topics: C80 information for Jay Wightman,
                               Pull 22 CAT2 cards from M111 and M112, Modify
                               the Trics_Boot file to compensate for the
                               pulled out CAT2's, Start using a new Init_DAC_
                               _Bytes.LSM file,  Try to understand more about
                               the events where L2 sees negative energy,

Last week I promised some C80 stuff for Jay Wightman; I need to do it.

Pull the CAT2
-------------
Pull the CAT2 cards for Tier 1 EM Et and HD Et out of M111 and M112 and pull
out the M111 Tier 2 EM, HD, Px and Py cards.   22 cards total.  The cards
pulled out are:

             M111 Tier 1                            M112 Tier 1
      --------------------------             --------------------------
                     HD  SN# 231                            HD  SN#  73
      Upper Tier 1   EM  SN# 141             Upper Tier 1   EM  SN#  57 **
                     HD  SN# 280                            HD  SN# 253
                     EM  SN# 148                            EM  SN# 132

                     HD  SN# 266                            HD  SN# 233
      Lower Tier 1   EM  SN# 282             Lower Tier 1   EM  SN# 277
                     HD  SN# 246                            HD  SN# 281
                     EM  SN# 237                            EM  SN# 255

             M111 Tier 2
      --------------------------            **  This card is bad.  See the
                    +Py  SN#  40                log book entry from yesterday.
                    -Py  SN# 106                This card appears to be sicker
        Tier 2      +Px  SN#  37                than just not reading back.
                    -Px  SN# 102                Its LED's looked like the
                     HD  SN# 103                correction register had not
                     EM  SN# 133                been loaded correctly.

Modify the TRICS_Boot file
--------------------------
Modify the TRICS_Boot file so that all TRICS access to the above 22  CAT2 cards
will be aimed to the CAT2 in M111 Tier 2 that is the EM Ref Set 3 counter tree
card.  Do this via MOD_HDB commands to TRICS.  Boot TCC and this appears to
be working OK.

New Init_DAC_Bytes.LSM
----------------------
Use yesterday's run of Find-DAC to make a new Init_DAC_Bytes.LSM file and load
it into TCC.  The previous run of Find-DAC was on  1-APRIL-1994 !  Not much
had changed.  Note that it is mostly HD towers in the central eta.  I had
previously noticed that -12,16 was low in the examine plots from global physics
running.         2560 towers have been examined

                 DAC_BYTE low 23 for EM,POS,E_16,P_15 (was 23)

                 DAC_BYTE increment -2 for EM,NEG,E_14,P_7 37->35
                 DAC_BYTE increment -2 for HD,NEG,E_2,P_21 35->33
                 DAC_BYTE increment -2 for HD,NEG,E_1,P_14 34->32
                 DAC_BYTE increment -2 for HD,NEG,E_1,P_20 33->31
                 DAC_BYTE increment -2 for HD,POS,E_1,P_16 43->41
                 DAC_BYTE increment -2 for HD,POS,E_2,P_25 34->32
                 DAC_BYTE increment -2 for HD,POS,E_4,P_18 31->29

                 DAC_BYTE increment  2 for HD,NEG,E_2,P_4  37->39
                 DAC_BYTE increment  2 for HD,NEG,E_2,P_7  30->32
                 DAC_BYTE increment  2 for HD,POS,E_1,P_21 35->37
                 DAC_BYTE increment  2 for HD,POS,E_2,P_10 34->36
                 DAC_BYTE increment  2 for HD,POS,E_2,P_18 36->38

                 DAC_BYTE increment  3 for EM,NEG,E_12,P_16 31->34
                 DAC_BYTE increment  3 for HD,NEG,E_12,P_16 37->40

                    7 tower(s) incremented by -2
                  122 tower(s) incremented by -1
                 2270 tower(s) incremented by 0
                  154 tower(s) incremented by 1
                    5 tower(s) incremented by 2
                    2 tower(s) incremented by 3
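
A quick consistency check on the summary above.  This is only a sketch and is
not part of Find-DAC; the name increment_counts is made up for the example.
The six increment bins should account for all 2560 towers examined.

```python
# Tower counts per DAC_BYTE increment, copied from the Find-DAC summary above.
# increment_counts is a name invented for this sketch, not a Find-DAC variable.
increment_counts = {-2: 7, -1: 122, 0: 2270, 1: 154, 2: 5, 3: 2}

# Every examined tower should land in exactly one bin.
total_towers = sum(increment_counts.values())

# Towers whose DAC byte moved at all (any non-zero increment).
changed_towers = total_towers - increment_counts[0]

print(f"towers accounted for:          {total_towers}")
print(f"towers whose DAC byte changed: {changed_towers}")
```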

Look for cause of the funny negative energy events in L2
--------------------------------------------------------

As the store was near the end, between 11:00 and 11:40, set up a special
trigger to look for the funny negative energy events.  Luminosity is varying
between 0.5 and 2.0 E30 and the Control Room makes a separator scan.

Make EM Ref Set 0 have threshold of 6 GeV at eta -1 only.  Make EM Ref Set 1
have threshold of 2 GeV at eta -14 only.  Require 3 hits in Ref Set 0 and 1
hit in Ref Set 1.  Main Ring is stacking.  There are no other terms in this
Special Spec Trig.  The rate is ZERO.  But both of last week's funny events
would have passed this trigger.  Has something changed with the funny events?
Move the thresholds around to prove that things are working:

       EM Ref Set 1        EM Ref Set 0
       eta -14 only         eta -1 only        Rate Hz
      --------------      --------------      ---------
           0.25                1.0                50
           2.0                 6.0                 0
           2.0                 1.0                 1
           2.0                 3.0                 0
           0.25                3.0           1 event in 2 minutes
           0.25                2.0               0.5

Try this again at the beginning of the next store.  Luminosity is 10. MR is
off.  Watch for about 2 minutes at 2.0 GeV at -14 and 6.0 GeV at -1 and
see no events.

Later during this store look at the log of errors from L2.  Most of them are
from -5,26 which translates into -9 or -10,51 or 52.  There are also a few
from  +2,12 --> 3,24    -5,14 --> -9,28    +2,1 --> 4,1    -6,12 --> -11,23
Jim sets a break point and tries to capture an event.
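
The index translations quoted above all fit one simple rule: a coarse index n
covers the fine indices 2n-1 and 2n, with the sign of eta carried through.
The sketch below only encodes that apparent rule.  It is inferred from the
examples in this entry, not taken from the L2 code, and the function names are
invented here.

```python
def coarse_to_fine(index):
    """Return the two fine indices covered by one coarse index.

    Assumption (inferred from the examples in this entry): coarse index n
    covers fine indices 2n-1 and 2n, and a negative eta stays negative.
    """
    sign = -1 if index < 0 else 1
    n = abs(index)
    return (sign * (2 * n - 1), sign * (2 * n))


def translate(eta, phi):
    """Translate a coarse (eta, phi) pair into its candidate fine indices."""
    return coarse_to_fine(eta), coarse_to_fine(phi)


# The locations named in the L2 error log, as coarse (eta, phi) pairs.
for eta, phi in [(-5, 26), (2, 12), (-5, 14), (2, 1), (-6, 12)]:
    fine_eta, fine_phi = translate(eta, phi)
    print(f"{eta:+d},{phi} --> eta {fine_eta}  phi {fine_phi}")
```

Each translation written in the text (for example -5,26 into -9 or -10, 51 or
52) is one of the candidates this rule produces.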
..............................................................................

Date:  17-AUG-1994  At: MSU  Topics: 15 errors at INIT time, Make a Find-DAC
                                     run,  Fix the problem in the file
                                     TRICS_Init_Auxi_L15CT.dat  that caused
                                     two errors from the MTGBit8 PALs when
                                     the ERPB-MTG was loaded up and started,
                                     Remove the ERPB_MTG_Setup.dat file.
15 Errors at Initialize
-----------------------
These are coming from the EM CAT2 in M112 Upper Tier 1  i.e. eta -20:-17
phi 1:8.   This card is not in use since the eta coverage clip of the Global
Tot Et and the Global Missing Et;  so this does not cause a functional problem
right now.  But it does cause errors at INIT time and this must be fixed.

Find-DAC
--------
Made a run of Find-DAC early this morning.  It is in the file DAC_17AUG94.Log
The run was successful in finding good pedestal values for all 2560 DAC's.
I have not moved these new values to the  Init_DAC_Bytes.LSM  file yet.
I checked it against the current running ped values and very little has
changed.

ERPB-MTG errors
---------------
There had been two errors reported when the  TRICS_Init_Auxi_L15CT.dat
executed the start up of the ERPB-MTG.  These two errors came from ERPB-MTG
channels #7 and #8 when the ERPB-MTG was disabled in preparation for loading
the LCA arrays.  These two ERPB-MTG channels use type 8 MTG Bit PALs.  The
problem was that the value 9 was being loaded into them to set them DC low
(like you would do to a normal MTGBit2 PAL).  But MTGBit8 PALs have the
strange clipped leads so a value of 2 is better to load into them.  A value
of 2 forces their output low but does not disturb the other internal setup.

ERPB_MTG_Setup.dat  file
------------------------
ERPB_MTG_Setup.dat  is no longer needed so it should be removed from all
of the places where it lives.  First it was copied to   [TrgCur.Obsolete]
and then it was removed from:   D0HTCC::DUA0:[Trigger],  TrgCur: (Fermi and
MSU),  TrgL15CT.Hardware_Software_Text (Fermi and MSU).  This leaves it
in [TrgCur.Obsolete] at Fermi and in [TrgCur.DZero] at MSU.
..............................................................................

Date:  10,11,12-AUG-1994  At: Fermi  Topics: Replace a brick in the Power Pan
                                     for M108 upper Tier 1,  Repair the problem
                                     that L15CT gets bad data from +2,26,  Work
                                     on the L2 zero or not digitized energy
                                     problem,  Clean up and THINGS TO DO.

Replace a brick in the Power Pan for M108 upper Tier 1
------------------------------------------------------
Since Monday afternoon there has been a problem with the -4.5Volt supply for
M108 upper Tier 1.  Monday afternoon there were a couple of alarms from this
supply.  Dan Owen checked it and it showed some funny noise on the scope and
the Fluke meter on AC reads some tens of mV and on DC it jumps around.  When I
checked this supply this morning it was reading about 20 or 30 mV on the AC
scale.  All other supplies read 0.000 Volts on AC.  When you first plug in the
Fluke to this supply it reads 1 or 2 mV on the AC scale, then it starts to ramp
up and then within 10 seconds or so it reads tens of mV.

I pulled PDM-21 out from M108 upper Tier 1 service and replaced its -4.5 Volt
brick.  Pull out brick SN#47 and install brick SN#38.  Then I reinstalled
PDM-21 in M108 upper Tier 1.

Repair the problem that L15CT gets bad data from +2,26
------------------------------------------------------
L15CT sees bit of value 2 as always high in the data from L1 Trigger Tower
+2,26.  We have known about this since 24-MAR-1994.  Dan Owen was also able to
find this problem in the data that they analyzed while working on the L15CT
Simulator.  Joan Guida was also able to find this problem while working on
the data from the Turn On Curve run.

Because the power was off in L1 Cal Trig racks today I finally fixed this
problem in the CTFE for +2,26.  This is CTFE SN# 90.  Pin 13 on U3, the 10H124
driver for some Total Et bits was not soldered.  This pin was folded under the
IC.  From the print set it does not appear that this folded under pin can touch
any traces so I just soldered it to its pad.

Work to understand why L2 "xyz" filter has started to see No data or
Negative Energies at eta phi locations pointed to by L1 Jet List
--------------------------------------------------------------------
Use the L1 Trig D0User to look at events with file name *xyz* in the directory
Scratch:[Long.Level2].   Two of these events had lots of negative energy and
showed many EM towers over threshold in the negative eta.  A summary follows:

    D0L225_L2EM_XYZ.DMP         D0L230_EM_XYZ_POS_81949.DMP
   1-AUG-1994 00:32:59.07         27-JUL-1994 05:04:23.31
   Spec Trig's Fired: 7, 12       Spec Trig's fired: 5
   996 blks                       769 blks
       EM list = 2                    EM list = 0

       eta,phi   GeV                  eta,phi   GeV
       -------  ----                  -------  ----
        -3,22   12.75
        +7,5     3.0

     D0L238_EM_XYZ.DMP          D0L210_XYZ.DMP           D0L230_L2EM_XYZ.DMP
  1-AUG-1994 23:46:38.03    10-AUG-1994 22:07:23.00    1-AUG-1994 00:50:59.79
  Spec Trig's Fired:        Spec Trig's Fired:           Spec Trig's Fired: 7
            8,16,20,21      7,8,10,12,16,20,21,22,26     990 blks
   568 blks                    573 blks
       EM list = 12              EM list =15                  EM list =2

       eta,phi   GeV            eta,phi   GeV               eta,phi   GeV
       -------  ----            -------  ----               -------  ----
       -14,23   2.75            -14,23   4.75               +4,30   13.25
       -11,18   3.5             -11,23   4.0                 +8,12    3.0
       -11,23   3.75            -10,18   5.5
       -10,18   6.75            -10,22   8.5
        -7,22   3.0              -7,22   2.75
        -1,9    3.0              -2,22   4.75
        -1,10   6.75             -2,23   3.5
        -1,11   4.5              -1,9    3.25
        -1,12   4.0              -1,10   7.25
        -1,20   3.75             -1,11   3.25
        -1,22   9.25             -1,12   4.25
        -1,23   6.25             -1,20   7.5
                                 -1,21   4.75
                                 -1,22   14.5
                                 -1,23   12.0
Clean up and THINGS TO DO:
-------------------------
We used L15CT in a special EM Calibration run in its normal mode of throwing
away events.  It was taking about 850 Hz in and passing about 200 Hz.

TRICS V6.2 is running and we have a  TRICS_Init_Auxi_L15CT.dat  that sets up
the ERPB MTG and then starts L15CT running.  Need to fix either the data
value loaded into ERPB-MTG Ch No 7,8 PALs  (BitPAL8)  or else change the
mask in the Hardware Database.  We once saw D0HTCC timeout trying to get a
.BLX file for a DSP.

We had one load failure of L15CT on the night of the 11-AUG-1994.  It was
caused by a DECnet time out as TCC was trying to get a BLX file from the
host.  I expect that there was just a temporary network problem or else
all the L2 nodes were trying to start up at the same time.  We just asked
TCC to load L15CT again and all was OK.  We just should be aware that this
can happen.

Need to give the most recent version of TrgMon to the general users via TrgUser
account.  Need to make a L1 Cal Trig Pedestal run.
..............................................................................

Date:  4,5,6-AUG-1994  At: Fermi  Topics: Test run of L15CT at the end of a
                                        global physics store and during the
                                        scraping of the next,  L15CT ALWAYS
                                        needs to be able to READOUT,  Data for
                                        thinking about overlap.

At the end of the store this morning and during the scraping of the store
this afternoon we ran L15CT as part of the global physics run. It is
connected to L1 Spec Trig #7 i.e. EM_1_Max. We started this morning with
L15CT actually throwing away events. Then for the last 5 minutes of this
morning's store we switched to "PANIC" mode (nothing else changed we just
paused and reloaded 68k_Services). During the scraping this afternoon we ran
L15CT in "PANIC" mode in a global physics run. All was normal in this
afternoon's run except muon was not running due to HV off because of
scraping.

The principal problem discovered was that about 75% of the time when
Spec Trig #7 fires,  some other Spec Trig has also fired.

During the store on the evening of 4-AUG-94 use TrgMon to study what
Spec Trig's are likely to fire at the same time as Spec Trig #7 fires.
Examine 32 events whose  List_of_Spec_Trigs_Fired  includes Spec Trig #7.

Of these 32 events, in 8 of them,  only Spec Trig #7 (EM_1_Max) fired.
Of these 32 events, in 8 of them,  Spec Trig #8 (EM_2_Med) also fired.
Of these 32 events, in 9 of them,  Spec Trig #10 (EM_1_Miss) also fired.
Of these 32 events, in 20 of them, Spec Trig #12 (EM_Jet) also fired.
Of these 32 events, in 3 of them,  Spec Trig #16 (Jet_Multi) also fired.
Of these 32 events, in 5 of them,  Spec Trig #20 (Missing_Et) also fired.
Of these 32 events, in 2 of them,  Spec Trig #21 (Jet_3_Miss) also fired.
Of these 32 events, in 3 of them,  Spec Trig #25 (1Jt_35) also fired.
Of these 32 events, in 7 of them,  Spec Trig #26 (1Jt_Max) also fired.

Thus with this LOW STATISTICS it appears that Spec Trig #12 has the most
overlap with Spec Trig #7.   (i.e. if Spec Trig #7 fires then there is
a 62% chance that Spec Trig #12 has also fired).

It also appears that Spec Trig's #8, #10, #26  all have a considerable
overlap with #7   (i.e. any of these three has about a 25% chance of firing
whenever Spec Trig #7 fires).
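
The overlap percentages quoted above can be reproduced from the counts in the
list.  This is a minimal sketch; the variable names are invented here and the
numbers are just the 32 event sample from this entry, so the same LOW
STATISTICS caveat applies.

```python
# Counts copied from the 32-event TrgMon sample above.
events_with_st7 = 32
only_st7 = 8  # events where Spec Trig #7 was the only one fired

# Spec Trig number -> number of the 32 events in which it also fired.
also_fired = {8: 8, 10: 9, 12: 20, 16: 3, 20: 5, 21: 2, 25: 3, 26: 7}

# Chance that a given Spec Trig has fired, given that #7 fired.
overlap = {st: n / events_with_st7 for st, n in also_fired.items()}

# Fraction of #7 events that are shared with at least one other Spec Trig.
shared_fraction = 1 - only_st7 / events_with_st7

for st, frac in sorted(overlap.items(), key=lambda item: -item[1]):
    print(f"Spec Trig #{st:2d} also fired in {frac:6.1%} of Spec Trig #7 events")
print(f"Spec Trig #7 shared with some other Spec Trig: {shared_fraction:.0%}")
```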

In these 32 events, there are a total of 65 fired L1 Spec Trigs.   --->
   When Spec Trig#7 fires, on the average, there is one other Spec Trig firing.

During the period that I collected these 32 events the Luminosity was
about 8.5 to 9  E30.   During the time I was collecting, the L1 prescales
were changed from the 11E30 list to the 8E30 list.

Which other Spec Triggers are likely to fire when Spec Trig #7 fires appears
to be strongly dependent on what prescale list is loaded into L1.

FOR L15CT TO BE USEFUL
The meaning of all of this is, that for L15CT to be useful, we need to filter
not just one Spec Trig but a whole set of Spec Trig's.  Muon people learned
this 2 years ago.  We could have learned it from the simulator or by testing
in beam in global physics running early in this project.

Two Management Problems of L15CT
--------------------------------
ALWAYS READOUT
--------------
The way things are managed for the physics runs is that there is one location
where the front-end crates that take part in physics runs are listed.  What I
mean here by physics run is: both the normal global physics run and any of the
"special run" physics runs, and cosmic runs, and all crates test runs,....

We needed to have L15CT included in this list to be part of global physics runs
but all of this implies that the L15CT crate will be read out lots of times
when we were not expecting it to be read out,  e.g. times when we expected
L15CT to be parked and in a dead loop.

There are at least two major places where we need to change the way things
are handled:

Right after TCC INIT Time
-------------------------
Right after TCC has been told to INIT the L15CT crate needs to be able to
readout something onto the data cable.  I.E. this is before COOR has ever
talked about L15CT.  For example COOR could INIT TCC and then setup an all
crates test trigger and L15CT crate needs to be able to readout.

When COOR is Finished Using L15CT for Physics Filtering,
COOR Must Return L15CT to a Benign State.
--------------------------------------------------------
At a minimum COOR needs to clear the  Spec-Trig vs L15CT-Terms  memory when
it is finished using L15CT.
Right now when COOR is finished with L15CT (e.g. near the end of a store
when switching to a special run) it does nothing to "de-program" L15CT.
At a minimum we need to get the Term_Select P2 "de-programmed" so that L15CT
can just readout IBS events (i.e. no "That's_Me" events).

Solutions
---------
The state of L15CT right after INIT time (i.e. before COOR has ever talked
about L15CT)  is clearly our responsibility.  We can cleanly take care of
this by using the  TRICS_Init_Auxi_L15CT.dat  file to hold commands that will
move L15CT to a running state with all Spec Trig's 0:15 setup to cause IBS
cycles.

The way this divides up makes good sense: the "build it" part of TRICS
brings L15CT to a default halted state, then  TRICS_Init_Auxi_L15CT.dat  brings
L15CT to a running state.  Using TRICS_Init_Auxi_L15CT.dat to come to the
running state gives us easy control over exactly what running state L15CT
comes to.

Cleaning up L15CT after COOR has used it for physics filtering, is clearly
the responsibility of COOR.  COOR needs to either give TCC a new special
message (e.g. Release_L15CT)  or else COOR needs to give the series of
currently defined L15CT commands that effectively result in the release
of L15CT.

Running in PANIC mode from the very start of Store 5071 evening of 5-AUG-94
---------------------------------------------------------------------------

Starting from the very beginning of Store 5071 we ran L15CT in PANIC mode on
L1 Spec Trig #7.   The following is TrgMon information from this and the next
run. Each row is the average of 10 sweeps of TrgMon  i.e. a 50 second average.

                                                                      Useful
       Chan 13  PreScl  SpTrig  Global Event  Spec Trig  Spec Trig     L15CT
       Lumnsty   List     #7     Transfer to  #7 Firing   #7 L15FW   Filtering
 Time    E30    In Use  PreScl  Level 2 Rate   Rate Hz     % Skip     Rate Hz
-----  -------  ------  ------  ------------  ---------  ---------  ----------
Store 5071
5AUG94
19:20    12.4    15E30     1        124 Hz     38.2 Hz     72.4 %     11.1 Hz
20:06    11.7    15E30     1        108        33.9        74.2        8.7
21:03    11.0    15E30     1         96.8      32.3        72.9        8.8

21:23    10.6    11E30     1        113.6      32.8        69.3        9.9
23:10     9.3    11E30     1         93.2      28.2        71.3        8.3

6AUG94
 7:40    6.14     6E30     1        112.5      17.0        69.2        5.2
10:45    5.4      6E30     1        100.2      16.0        71.0        4.6

Store 5073
7AUG94
00:15    13.0    15E30     1        125.2      39.8        72.4       11.1


SpTrg_Fired_List_1000_6AUG94.txt   one hour of data centered around 10AM
                                   Channel 13 Luminosity averaged 5.6 E30
                                   PreScale List 6E30

SpTrg_Fired_List_0030_7AUG94.txt   1.2 hours of data centered around 00:30 AM
                                   Channel 13 Luminosity  13.2 at start of file
                                                          12.6 at 1/2 way point
                                                          12.0 at end of file
                                   PreScale List 15E30
..............................................................................

Date: 26,27,28-JUL-1994 At: Fermi Topics:  Re-Install the "A" Hydra-II, Look at
                                        the problem of why can't we Load the
                                        LCA's from the ERPB_MTG_Setup.dat file,
                                        replace Power Pan MM4 with MM3,
                                        Turn-On Curve run, new TRICS and 68k-
                                        Services code,  TRICS_Init_Auxi_L15CT
                                        file,  Boot instructions for L15CT.

Reinstalled the "A" Hydra-II.  Installed MSU SN#1 as the running "A" Hydra-II.
MSU SN#5 is stored here at D-Zero as a spare.  It is in a white box in the
bottom of the spare cards rack.  It has all of its expander and paddle cards
installed and Steve has checked it just before shipping from MSU to D-Zero.

Loading LCA's
I connected the Logic Analyzer to the ERPB MTG lines that cause Loading of the
LCA's.  Running the ERPB_MTG_Setup.dat file caused nothing to happen.  I do not
know how or why I saw the yellow lights flash a couple of weeks ago.  Was it
a different ERPB_MTG_Setup.dat file ?  Was it caused by the different (broken)
ERPB MTG card ?   Anyway it takes some special running of the MTG to get it
to single cycle scan the PROM Address range 1000 to 1600  i.e. mostly upper
bank.  I updated the ERPB_MTG_Setup.dat file and distributed it to all four
places.  The logic analyzer now shows good signals.  I have not had power off
to the L1 racks so I do not know if it actually Loads the LCA's.

Made the Turn-On-Curve run for L15CT  on 27-JULY-1994.   After that we
switched to the V6.1 version of TRICS and the latest 68k_Services.  These
together give an on line look into what L15CT is doing.  These new versions of
TRICS and 68k_Services have been left running.

28-JUL-1994
Called at 7 this morning because a L1 Trig Alarm came in.  The Alarm actually
came in at about 3:20 but they let me sleep until close to shot setup time.
The shot setup failed when they lost the stack at the start of shot setup.
The Power Pan in M112 that services the M111 Tier 2 had its breaker trip.
When I turned it back on all bricks came up OK except for the -4.5 brick.
I pulled Power Pan MM4 and installed MM3.   MM4 has a bad -4.5 brick.  The
bad brick is SN#30 7-FEB-1992.  While the power was off I pulled the tie down
off of the M102 air flow sensor (see last week's log entry).

Load LCA's and ERPB_MTG_Setup.Dat
---------------------------------
When turning the racks back on after replacing the Power Pan I checked to
see if we could load the LCA's.  At power up 9 of the 10 racks had yellow
lights on.  Only M109 had its yellow lights off.  After running the
ERPB_MTG_Setup.dat  all 10 racks had the yellow lights off.  Load LCA works.

Recall that we are NOT currently using  a TRICS_Init_Auxi_L15CT.dat  file
so if power has been off then it IS still necessary to manually execute the
ERPB_MTG_Setup.dat file.

L15CT Boot Instructions
-----------------------
I have written more "Boot Instructions" for the L15CT 68k and a description of
the characters displayed on the 68k_Services terminal. I have modified the
boot instructions from what Steve got started a couple of weeks ago. They
explain where the new VME Reset button is and emphasize, "Why do you think
that you need to do this boot". They also talk about turning power ON or OFF
to the L15CT. I have put labels on more stuff in the MCH e.g. rack M124, the
terminal... I gave a copy of these instructions to Joan Guida.  I have not
made them public yet.
..............................................................................

Date: 25-JUL-1994      At: MSU    Topics: Work on broken Hydra-II and also
                                          prep and test a spare Hydra-II
                                          for FNAL.

Steve replaced the Global SRAM for DSP #3 on Hydra-II MSU S/N #4 (which had
been DSP-A at FNAL until last week, see the 21..23-JUL-1994 entry).  This
appears to have fixed the "Sanity and Configuration Checker's" displayed
error.  But we never saw the "SCC never try to boot DSP #3" error which
Dan saw last week.  Steve looked for this error both before and after
replacing the GSRAM.  So for now, let's keep this card at MSU.  We will
use the recently-repaired Hydra-II MSU S/N #1 (with TCPE S/N #4, TPPB S/N
#4, and SPPB S/N #2) as the DSP-A card at FNAL.

Steve removed the TCPE and TPPB (both S/N #3) from Hydra-II MSU S/N #4 and
installed them on Hydra-II MSU S/N #5.  A new SPPB (S/N #8) was built for
Hydra-II #5.  Recall that the TCPE and TPPB are supposed to be interchangeable,
but the SPPB must always be "custom-built" for its associated Hydra.  The
Hydra-II #5 is going to FNAL as the fully-assembled, long-term support spare.
Hydra-II #4 is staying at MSU, with only its SPPB.  This is the only Hydra
which will remain at MSU.
..............................................................................

Date: 21,22,23-JUL-1994  At: Fermi  Topics:  Clip the Global Missing Et and
                                    the EM Et and the HD Et at Eta 3.2,
                                    modify Trics_Init_Auxi as part of doing
                                    this,  Test run with both L15CT and L15
                                    Muon running and rejecting events,   Work
                                    on the mismatch problem between L0 Bunch
                                    Number and L1 Bunch Number,  Boot Rev 6.1
                                    TCC code along with the newest 68k_Service
                                    code,  Work with L15CT

----->  There is an Air Flow Sensor left tied down in the L1 Racks.  <-----
----->  Clean this up next trip or next time that power is off.      <-----

Bunch Number Mismatch
---------------------
In L2 Filter Code the L1 Bunch Number is determined by reading the 6
Bunch_P_Gate And-Or Term DZero Note 967 Item Number 381.  This is from the
IMLRO Card in the upper And-Or cardfile in rack M101 i.e. And-Or Terms 0:127
Spec Trigs 0:15.  Since "for ever" there has been a low level of mismatch
between L0 Bunch Number and the L1 Bunch Number e.g. one or two errors per
physics run.   Early this week this climbed to 1 per 1000 events transferred
to L2.

Rich Partridge and Jeff Bantly showed by complicated tests that L0 was most
likely OK and  after they  learned about the  second copy of  the And-Or Terms
(i.e. for Spec Trig's 16:31) they were  able to show that the Spec Trig's 0:15
readout of the And-Or Terms did not match the Spec Trig's 31:16 readout of the
And-Or Terms in the  region of the Bunch_P_Gates  for those events where there
was a L0 to L1  mismatch and  that it looked  like L0 matched  the Spec Trig's
31:16 version.

I added a check of the Bunch_P_Gate And-Or Terms to the VTC Code.  This new
part of the VTC Error_Checking routine verifies that:

   The Spec Trig's 0:15 version of the Bunch_P_Gate Terms matches the
   Spec Trig's 31:16 version,
   That the Current Bunch_P_Gate equals the Previous Bunch_P_Gate +1  Mod 6,
   and that there is one and only one Bunch_P_Gate And-Or Term active.
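
The three conditions above can be sketched as follows.  This is NOT the VTC
Error_Checking routine, which is not reproduced in this log.  It is a minimal
illustration that assumes each set of Bunch_P_Gate terms is available as a
list of six 0/1 values; the function names and that data layout are invented
for the example.

```python
def active_gates(terms):
    """Return the positions (0..5) of the Bunch_P_Gate terms that are set."""
    return [i for i, bit in enumerate(terms) if bit]


def check_bunch_p_gates(current_0_15, current_31_16, previous):
    """Apply the three checks listed above and return a list of error strings.

    current_0_15, current_31_16: the Current Bunch_P_Gate terms as read from
    the Spec Trig 0:15 copy and the Spec Trig 31:16 copy (assumed layout).
    previous: the Previous Bunch_P_Gate terms.
    """
    errors = []
    # Check 1: the two copies of the And-Or Terms must agree.
    if list(current_0_15) != list(current_31_16):
        errors.append("0:15 copy does not match 31:16 copy")
    # Check 3: one and only one Bunch_P_Gate term active.
    cur = active_gates(current_0_15)
    if len(cur) != 1:
        errors.append("not exactly one Bunch_P_Gate active")
    # Check 2: Current equals Previous + 1, Mod 6.
    prev = active_gates(previous)
    if len(cur) == 1 and len(prev) == 1 and cur[0] != (prev[0] + 1) % 6:
        errors.append("Current is not Previous + 1 Mod 6")
    return errors


# Good crossing: previous gate 5 wraps around to current gate 0.
print(check_bunch_p_gates([1, 0, 0, 0, 0, 0], [1, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 1]))
# Bad crossing: the two copies disagree about which gate is current.
print(check_bunch_p_gates([0, 0, 1, 0, 0, 0], [0, 1, 0, 0, 0, 0], [1, 0, 0, 0, 0, 0]))
```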

This new error checking routine immediately found errors when we ran the
normal test trigger.    OK, we now have a tool to see the problem.

In M101 pull IML SN#15 (was And-Or Terms 0:127 Spec Trig's 0:15) and replace
it with SN# 17 which has had all of its 10H101's replaced with 10101's.
In M102 pull IML SN#10 (was And-Or Terms 0:127 Spec Trig's 31:16) and replace
it with SN# 12 which has had all of its 10H101's replaced with 10101's.
This did NOT help but I left the new Non 10H101 cards installed.

In M101 pull IMLRO SN# 17 and replace with SN#22  (And-Or Terms 0:127 Spec
Trig's 15:0).  This did NOT help so the original IMLRO SN# 17 was put back
into M101.

Each time after working on this system I noticed that there were never
any errors when the system was first turned on and was cold and L1CT was
mostly still off.  It takes L1CT running for about 5 minutes to warm things
up before we start to see the errors.

Pulled the M101 upper And-Or cardfile (And-Or Terms 127:0 Spec Trig's 15:0)
MBD card (SN# MBD-019).  Replaces all 10H101's with 10101's and replaced the
10H109 with a 10109.   This did NOT fix the problem.

Perhaps it is a Front CBus problem on this And-Or Cell.  Loaded things up to
get a test trigger running, then replaced the Front CBus with a 3 connector
cable to pickup just the MBD, IML, and IMLRO.   This did NOT help or
make things worse.

Perhaps it is a problem on the backplane in the range of And-Or Term #95
through #127.   I put on an extra terminator on the backplane to screw up
the signals a little.   This did NOT help or make things worse.

Perhaps it is something on the MBD besides the 10H101's.   I put an extra
terminator on the back CBus connector of the MBD.  This did NOT help or
make things worse.

Perhaps it is something on the MBD besides the 10H101's.   I replaced the MBD
SN# MBD-019 with the one spare MBD at Fermi  SN# ???.  This appeared to help
a little for a while and then things got bad again.

Well this looks like a flakey error  i.e. it's good when first turned on
and then gets bad.  Well maybe the timing is too tight in the CBus cycles
of the Data Block Builder.   Currently we are running 7 CBus cycles per
beam crossing.   I make a new Framework main timing MTG PROM #3.   This
is called Revision N.   This has 6 CBus data block builder cycles per beam
crossing.   With this installed there are no more errors.
Or is it just that the old PROM is bad?

One other clue.  The errors picked up by the error check in VTC never said
that an illegal set of Bunch_P_Gate And-Or Terms were active (e.g. zero
And-Or Terms or more than one And-Or Term)  ---> The problem can not be random
data on the CBus data lines or random address on the CBus address line.  The
problem must be the address line that selects between Current and Previous.

This Rev. N PROM #3 for Framework MTG keeps the positive going edge of the
COMINT Clock at tick 11 at the same place as Rev M.   I.E.  I think (and hope)
that it is only this positive edge of COMINT Clock that needs to be synchronous
with other L1 FW activity.  NEED TO VERIFY THIS

The new TSF file is   Framework_MTG_PROM_3_SN_3N.TSF.   The old version was M.

Note when working with a simple prescale only L1 trigger,  if you want to see
all of the different bunches then the prescale must not be divisible by 6 and
also the prescale x 2, ... the prescale x 5   all must not be divisible by 6.
12113 and 24113 are good numbers.
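
The prescale rule in the note above can be checked with a few lines.  This is
a sketch under one assumption: the bunch number advances by one on every beam
crossing, so the i-th prescaled trigger lands on bunch (i x prescale) Mod 6.
The function names are invented for the example.

```python
def sees_all_bunches(prescale, n_bunches=6):
    """The rule as stated: none of prescale x 1 ... prescale x 5 may be
    divisible by 6."""
    return all((k * prescale) % n_bunches != 0 for k in range(1, n_bunches))


def bunches_seen(prescale, n_bunches=6, n_triggers=60):
    """Bunch numbers hit by the first n_triggers prescaled triggers,
    assuming the bunch number advances by one per crossing."""
    return sorted({(i * prescale) % n_bunches for i in range(n_triggers)})


for prescale in (12113, 24113, 12000, 4):
    print(prescale, sees_all_bunches(prescale), bunches_seen(prescale))
```

A prescale of 4 only ever reaches bunches 0, 2 and 4, while 12113 and 24113
cycle through all six.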

Work with L15CT
---------------
Well there were many power cycles of L1 racks so I got to play running the
ERPB_MTG_Setup.DAT file a bunch to see if it can load the LCA's.
      ---> It does NOT appear to work. <---
The only thing that works is pushing the button on the distributor caps.

It looks like Hydra A is dead.   Sometimes it hangs saying it is waking up
DSP #4 and sometimes it gets to DSP #3 and says that there is a memory error:

   DSP #3 at 0x c0002022     wrote 0x 0      read 0x 1000

This was all working Thursday night when we ran L15CT and L15 muon at the
same time with both rejecting events without any problem.
I even tried power cycling the L15CT crate and this did not help.

Pull the "A" Hydra-II card.  It is SN#7047  MSU #4.

When pulling out Hydra-II "A" I did it from the left i.e. pulling out the
IRONICs cards first.  Before I pulled the twist and flat cables off of the
front of the HYDRA I added more information to the labels.  I added the
following to help quickly get them back in the right spots:

              e.g.        A 1 L
                          | | |
         A, B, or C ------+ | +------- L, M, or R
                      1:6 --+

This is Hydra-II  A, B, or C      Connector 1:6 (1 is the top connector)
L Left,  M Middle,  R Right  row of connectors when viewed from the front.

On Thursday night (when L15CT was still working) we made runs with both L15
muon and L15CT running and rejecting events.  This worked OK.

Clip Global Missing Et and Global Total Et to |ETA| <= 3.2
----------------------------------------------------------
To clip the Global Missing Et and Global Total Et coverage, I pulled out the
M111  Tier 2  CAT2 output cables from the  Px, Py, EM Et, and HD Et  CAT2's
cards.

Trics_Init_Auxi.DAT was then changed to ever write the Correction Registers at
Tier 3.   Note Trics_Init_Auxi already had commented out code in it to do this.

This commented out code in Trics_Init_Auxi was exactly equal to code in the
file TrgCur: Tree_Offset_eta_16.dat.   But I think that both of them had wrong
values for the HD 2nd lookup correction registers.   I fixed the HD 2nd lookup
correction register values in Trics_Init_Auxi.dat but I did not change
Tree_Offset_eta_16.dat.   Perhaps these old values of HD 2nd lookup correction
reg values were proper for the Run 1A HD PROM's.

If we are going to run this way then I would like to remove the M111  Tier 2
Px, Py, HD Et, and EM Et cards and the M111 and M112  Tier 1  HD Et and EM Et
cards.  A total of 18 CAT2 cards.  This is a significant amount of power.
---> TRICS would need to be changed so that it does not INIT these cards. <---

Tried TRICS V6.1 and latest 68k_Services
----------------------------------------
TRICS V6.1 appears to have two possible problems:

Something is funny at Boot and INIT time that causes it sometimes (perhaps
30%) to get stuck in a funny mode where all L1 Data Blocks appear to be
overwritten or the pipe control is wrong or it does not resync the pipes or
something like that.

It appears that it may have the wrong base address of the status blocks
or the wrong status block organization when reading status from the DSP's.
..............................................................................

Date: 19-JUL-1994      At: MSU    Topics: Look at VSB Mastership negotiation

We wanted to understand whether the MVME135 CPU can acquire VSB mastership
from the Hydra-II while the Hydra-II is performing DMA access of the VSB
memory at full speed.  In order to do this test at MSU, we set up the VSB
bus masters in the "normal" way (i.e. the way they are at FNAL:  135 is
Crate Controller and can request Mastership, Hydra-II can only request
Mastership).  We then set up a maximum-length DMA list in the Hydra-II,
transferring from a fixed on-chip SRAM location to a fixed VSB address.
We started the DMA list, and then used 135BUG to also access VSB memory
(via the MD/MM commands).

The Hydra-II DID transfer VSB mastership to the 135 in this situation.
That is, the Hydra-II will not "hog" the VSB bus during DMA accesses,
but it will instead release the bus when requested.

The test was also performed using the C40 CPU to access the VSB memory.
In this case, the RPTS instruction was used to perform zero-overhead
looping on a STI instruction which pointed into VSB memory.  Again, the
results were the same.

Also note that the time required to perform each VSB access was the same
between CPU-initiated cycles and DMA-initiated cycles.  In each case, a
complete access required 700 ns.  This was determined by running either
the DMA engine or the RPTS instruction for a fixed amount of time and
then determining how many transfers were performed by looking either at
the Transfer Counter or the Repeat Counter.
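
The timing method above is just elapsed time divided by the number of
completed transfers.  A small sketch of that arithmetic is given here.  The
run time and counter value in the example are hypothetical numbers chosen to
reproduce the 700 ns figure; the actual counter readings were not recorded in
this entry.

```python
def access_time_ns(run_time_s, transfers):
    """Average time per VSB access, from a fixed run time and a transfer count."""
    if transfers <= 0:
        raise ValueError("need at least one completed transfer")
    return run_time_s / transfers * 1e9


if __name__ == "__main__":
    # HYPOTHETICAL example: 0.7 s of running and 1,000,000 transfers counted
    # in the DMA Transfer Counter (or the RPTS Repeat Counter).
    print(round(access_time_ns(0.7, 1_000_000)), "ns per access")
```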
..............................................................................

Date: 14-JUL-1994      At: MSU    Topics: Edit Trics_Boot_Auxi.dat and
                                          TrgMon_FS.RCP to implement the
                                          renamed scalers for Active MR Veto.

DZero is making permanent the change over to Active MR Veto running.  We
needed to rename a number of Foreign Scalers in the  Trics_Boot_Auxi.dat  file
to implement this.  The old file was put in [TrgCur.Archive] at Fermi   and
the new file is on TCC's disk, in TrgCur:, and copied to MSU::[TrgCur.DZero].

Renamed the same set of Foreign Scalers in the  TrgMon_FS.RCP  file.  This
new file is in HTrgMon: at Fermi, in the TrgUser account at Fermi and copied
to MSU::HTrgMon:

The following are the new names of these scalers in the Luminosity files:

  NIM to ECL      Pair on
  Module Lemo      the 17
  Connector      Pair Cable        What signal is it.   Where does it go.
 -------------  -----------  -------------------------------------------------

 14th from top        4      This scaler was:  BX_Counts_of_MR_Veto_High
                             It becomes:       BX_Counts_of_MRBS_and_MicroBl
                             This is:  Foreign Scaler #31 Gate A
                                       DBSC Ch #2 in slot 12 CA=35

 15th from top        3      This scaler was:  BX_Cnts_of_MR_Veto_High_or_Low
                             It becomes:       BX_Cnts_MRBS_and_uB_or_MR_Low
                             This is:  Foreign Scaler #30 Gate A
                                       DBSC Ch #3 in slot 12 CA=35

 10th from top        8      This scaler was:  BX_Cnts_MR_Hi_or_uB_or_Mu_HV
                             It becomes:       BX_Cnts_of_MicroBlank_or_Mu_HV
                             This is:  Foreign Scaler #35 Gate A
                                       DBSC Ch #2 in slot 11 CA=32

 11th from top        7      This scaler was:  BX_Cnts_MR_Hi_or_Low_or_Mu_HV
                             It becomes:       BX_MRBS_and_uB_or_MR_Low_or_MuHV
                             This is:  Foreign Scaler #34 Gate A
                                       DBSC Ch #3 in slot 11 CA=32
..............................................................................

Date: 12-JUL-1994      At: MSU    Topics: TCC Problem at Fermi

TCC "hung".  Tried to look at the directory on TCC disk.
    $ dir d0htcc::dua0:[trigger]
    %DIRECT-E-OPENIN, error opening D0HTCC::DUA0:[TRIGGER]*.*;* as input
    -RMS-F-NET, network operation failed at remote node; DAP code = 01F77C54
    $ show time
      12-JUL-1994 13:40:59
I do not know if there were other problems (e.g. network or L2 nodes) at the
same time.
..............................................................................

Date:  8-JUL-1994       At: Fermi       Topics: Take snapshots of L1.5 Cal
                                                Trig without beam

We set up the Logic Analyzer to monitor the L1.5 Cal Trig.  We are taking
"snapshots" of the "big 5" L1.5 Cal Trig operation modes.  We want to really
nail down the details of time usage in L1.5 Cal Trig.  The setup of the
Logic Analyzer that was used is stored on Logic Analyzer Disk #6 in the
file SNAPSTUP.C15.  The Logic Analyzer was set up as follows:

    Pod #1: TTL
    -----------
    Signal on pod       Signal name             Signal source
    -------------       -----------             -------------
    0: hld_tran         Hold Transfer           Path Select 4th from top
    1: some_hap         Something Happened      Path Select 5th from top
    2: thats_me         That's Me               Path Select top
    3: vme_a20          VME A20                 VME Bus #1 C18
    4: vme_a21          VME A21                 VME Bus #1 C17
    5: vme_a22          VME A22                 VME Bus #1 C16
    6: vme_a23          VME A23                 VME Bus #1 C15
    7: vme_/wrt         VME *WRITE              VME Bus #1 A14
    8: vme_/as          VME *AS                 VME Bus #1 A18
    9: po_ans_0         "Port" Ans for Term 0   Term Answer top
   10: ff_don_0         "FF" Done for Term 0    Term Answer 5th from top
   11: vsb_bfsl         VSB Buffer Select       tapped from VSB Buffer Sel.
   12: not used
   13: not used
   14: not used
   15: not used

    Pod #2: ECL (variable threshold = -1.4V)
    ----------------------------------------
    Signal on pod       Signal name             Signal source
    -------------       -----------             -------------
    0: strt_dgt         Start Digitize          single-signal cable
    1: xmt_trig         Transmit Trigger        "Slave" MTG pins 21-22

We collected the following files:

    SNPBNSTB.C15, SNPBNST2.C15:     "N" with all signals
    SNPBNNOS.C15, SNPBNNO2.C15:     "N" with no VME /WRT, no VME /AS

    SNPLNSTB.C15, SNPLNST2.C15:     "n" with all signals
    SNPLNNOS.C15:                   "n" with no VME /WRT, no VME /AS

    SNPBFSTB.C15, SNPBFST2.C15:     "F" with all signals
    SNPBFNOS.C15, SNPBFNO2.C15:     "F" with no VME A22, no VME /AS

    SNPBISTB.C15, SNPBIST2.C15:     "I" with all signals

    SNPLISTB.C15, SNPLIST2.C15:     "i" with all signals

These files will need to be carefully dissected at MSU.  We performed
simple sanity checks on these files at Fermilab.  We see that the L1.5
Cal Trig requires 120-123 us between Start Digitize and returning DONEs
to M103 L1.5 Framework.  Note that these snapshots were taken running at
0.57 Hz so there is no overlap between triggers.  The 68K Services CPU
spends a lot of time waiting for Global to be at D3 in this configuration.
With more overlap between triggers, the 68K will not be so quick to start
looping on the "check Global for D3" loop.

We removed the Logic Analyzer and returned the L1.5 Cal Trig to a standard
clean configuration.  We ran at 570 Hz (with a Mark and Force Pass ratio
of 1919) for 25 minutes with 0 errors.  Note that, running at 570 Hz,
every event following a Mark and Force Pass event is accepted via L1.5
Framework Timeout.  This is the known behavior:  the L1.5 Cal Trig is not
able to service this event within the L1.5 Framework Timeout timeframe, so
the L1.5 Framework times this event out.

Note that, in order to make "i" events, we typically change the programming
of the L1.5 Trigger Framework.  The Specific Trigger that is being used
to make "i" events must be a L1.5-Type Specific Trigger.  The programming
of the L1.5 Trigger Framework Veto/Confirm MTG must be changed to force
the Specific Trigger to be Vetoed.  The Veto/Confirm MTG is CBUS=2, MBA=57,
CA=19.  The value "1" needs to be programmed into two Function Addresses
in this card in order to force the Specific Trigger to be Vetoed.  These
Function Addresses are:  the same as the Specific Trigger to be Vetoed, and
16+the Specific Trigger to be Vetoed.  e.g. to force Specific Trigger 2 to
be Vetoed, put a "1" into both FA=2, and FA=18.  To return the L1.5 Trigger
Framework to its normal operation, program the value "254" into the same
two Function Addresses.
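
As a memory aid, the address arithmetic above can be sketched as follows.
This only computes which (Function Address, value) pairs to write; it does not
talk to any hardware.  The function name and the 0:15 range check are
assumptions made for this illustration (the range is inferred from the "16+"
offset), and the card address is the one quoted above.

```python
# Card address of the L1.5 Framework Veto/Confirm MTG, as quoted in the log.
VETO_CONFIRM_MTG = {"cbus": 2, "mba": 57, "ca": 19}

FORCE_VETO = 1   # value that forces the Specific Trigger to be Vetoed
NORMAL = 254     # value that returns the card to normal operation


def veto_confirm_writes(spec_trig, value):
    """Return the two (function_address, value) pairs for one Specific Trigger."""
    # ASSUMPTION: Specific Trigger numbers 0..15, inferred from the 16+ offset.
    if not 0 <= spec_trig <= 15:
        raise ValueError("expected a Specific Trigger number in 0..15")
    return [(spec_trig, value), (16 + spec_trig, value)]


if __name__ == "__main__":
    for fa, val in veto_confirm_writes(2, FORCE_VETO):
        print("write", val, "@", VETO_CONFIRM_MTG, "fa", fa)
    for fa, val in veto_confirm_writes(2, NORMAL):
        print("write", val, "@", VETO_CONFIRM_MTG, "fa", fa)
```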

..............................................................................
Date:  7-JUL-1994       At: Fermi       Topics: Take Dan Owen special run
                                                during beam

We took the "Dan Owen/Mike Tartaglia" special run during beam.  Some TRGMON
screen captures and the programming of the L1.5 Cal Trig are stored in the
file VWORK1:TRGMON_DUMP.TXT_L15CT_BEAM_7JUL94.  A summary of this run is
included below:

 Approximate Luminosity = 2.3 E 30

 MFP Ratio = 5
 Actual rejection rate estimated at 86% by looking at 68K_Services tube
 TRGMON-displayed rejection rate estimated at (4/5)*Actual_rate = 69%
 TRGMON-displayed rejection rate: 68%-71%

 Dead Beam Crossings due to L1.5 approximately 0.81% @ 76 Hz, 1.1% @ 100 Hz
 Geo Sect 5 Front End Busy approximately       1.1% @ 78 Hz in/ 24 Hz transfer

 TRGMON rates:      Specific Trigger #1 ("That's Me")  = 100 Hz
                    Specific Trigger #2 ("IBS")        =   2 Hz

 Approximate rates from 68K_Services screen:

    I =  3/80 * 102 Hz =  3.8 Hz
    F = 16/80 * 102 Hz = 20.4 Hz
    N =  6/80 * 102 Hz =  7.7 Hz
    n = 55/80 * 102 Hz = 70.1 Hz

 Rough deadtime calculation:  @100 Hz, 1.1% L1.5 Dead X = 110 us/L1.5 Cycle

        4/5 of these cycles take the "normal" amount of time, about 130 us
        1/5 of these cycles take the "short" amount of time, about 30 us

        4/5 * 130 + 1/5 * 30 = 110 us/L1.5 Cycle average
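
 The same rough calculation written out as a sketch, in case it is useful for
 other rates or MFP ratios.  The inputs are the numbers quoted above; the
 function names are made up for this illustration.

```python
def dead_time_per_cycle_us(dead_fraction, cycle_rate_hz):
    """Average dead time per L1.5 cycle implied by a dead-crossing fraction."""
    return dead_fraction / cycle_rate_hz * 1e6


def weighted_cycle_time_us(mix):
    """Average cycle time for a list of (fraction_of_cycles, time_us) pairs."""
    total = sum(fraction for fraction, _ in mix)
    return sum(fraction * time_us for fraction, time_us in mix) / total


# 1.1% dead crossings at 100 Hz of L1.5 cycles.
measured_us = dead_time_per_cycle_us(0.011, 100.0)

# 4/5 of cycles "normal" at about 130 us, 1/5 "short" at about 30 us.
model_us = weighted_cycle_time_us([(4 / 5, 130.0), (1 / 5, 30.0)])

if __name__ == "__main__":
    print(f"measured: {measured_us:.0f} us/cycle, model: {model_us:.0f} us/cycle")
```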

The configuration for this special run was screwed up, however.  The Level
1 Reference Set was only defined out to Trigger Tower eta index +/-6.
Therefore, all of the found objects were on Hydra-B.  This run should be
re-done.  This can (and should) be done without manual Dan/Steve intervention.

Steve started a set of booting instructions for the L1.5 Cal Trig 68K
Services computer.  We really need to install a nice pushbutton on the
VME RESET* for the L1.5 Cal Trig VME Crate.  This crate will need to be
manually reset occasionally and it is scary to imagine random people trying
to press the MVME135 RESET button.  Note that the Slave Vertical
Interconnect's RESET pushbutton cannot be easily tied up to the VME RESET*
signal (i.e. there isn't a jumper on the Slave VI to do this).

The VBD began exhibiting its old "VBD reset required after VME RESET" feature.
Recall that this feature results in the Sequencer console scrolling Token Loop
Count errors for Crate 51.  Did this problem ever really go away?  Pushing VBD
reset is included in the 68K Services booting instructions.

..............................................................................

Date:  6-JUL-1994       At: Fermi       Topics: modify Path Select P2, install
                                                ERPBs/DC for Rack M112, change
                                                ERPB MTG setup, replace ERPB
                                                MTG PCB, new "Panic" mode idea,
                                                Replaced the Tier 2 -Px CAT 2
                                                card in M105, Problem at Init
                                                time after having power off,
                                                Ran CalTrig_Random

We modified the Path Select P2 card at Fermilab to produce an ECL output of
the "That's Me" signal.  Also, "internal" modifications were made to the card--
the "Hold Transfer" signal is now buffered to reduce the load on the inter-P2
bus, and the Hold Transfer delay is now fixed at 300 ns.  We added some TTL
test points to the card.  Looking at the card from the back of M124, the
test points are:

        (top)

    That's Me                   X X     GROUND
    Delayed Hold Transfer       X X     GROUND
    >= 1 Term to be evaluated   X X     GROUND
    Hold Transfer               X X     GROUND
    Something Happened          X X     GROUND
    (open)                      X X     GROUND

We also modified the "chain" of MTG Master Channels.  The current arrangement
is:

    "That's Me"             is EXTBIT for MTG Channel 8

    MTG Channel 8 BITOUT    is EXTBIT for MTG Channel 3  (Store_Enable_Bar)
               and inverted is EXTBIT for MTG Channel 4  (Latch_Enable_Bar)
                        and is EXTBIT for MTG Channel 7

    MTG Channel 7 BITOUT    is EXTENB for MTG Channel 5  (Transmit_Trigger)
                            (Channel 5 now has a BIT2 PAL rather than a BIT8)

We cleaned up this wiring somewhat on the patch panel, but we have not done
a 100% final installation of this wiring, because this wiring will change
when we start double-buffering the ERPBs.  The final installation of this
wiring should actually be done on the "jumper block" at the front of this
MTG.

We installed the ERPBTG1C PROM in the ERPB MTG.  This PROM changed the timing
for Channel 5 to be a 1us pulse, up at 45 and down at 72.  This is appropriate
for the BIT2 PAL which is now in Channel 5.  The Transmit Trigger will now
be a 1 us pulse rather than the previous 3.5 us pulse.  This change was not
strictly necessary (i.e. we didn't see any problems which we could trace to
the old PAL) but BIT2 PALs are easier to think about.

We edited the ERPB_MTG_SETUP.DAT file to account for the change to a BIT2
PAL in Channel 5.

Dan installed ERPBs and the DC in Rack M112.  This completes the ERPB/DC
installation.  The Serial Numbers are (in descending order, starting with
the DC at the top of the rack):  DC-6; ERPB-88; ERPB-89; ERPB-90; ERPB-91;
ERPB-92; ERPB-93; ERPB-95; ERPB-94.  We looked at the data from these
ERPBs to verify that the DSPs were seeing 128 bytes per Comm Port, and that
no dangerous bits were stuck on.

It required about 4 hours of all power off to install these last ERPB's and
route their cables in through the cable clamp in M111.  When power was
turned back on after 4 hours and we tried to initialize the system there
was an error.

      S-INI/ODB%COORini% Initializing all Specific Triggers
      E-HIO/HDB%COORini% Failure Writing 240 @ cbus 2 mba  57 ca 34
                                                     fa  27 read 248
      E-HST/ODB%COORini% Failure Programming Spec Trigger #11 Requiring
                                                     Level 1.5 Term #12
      E-INI/ODB%COORini% Failure Initializing Spec Trig #11 Requiring
                                                       of L1.5 Term #12
             .              .        .         .
             .              .        .         .
      E-HIO/HDB%COORini% Failure Writing 240 @ cbus 2 mba  57 ca 34
                                                     fa  27 read 248
      E-HST/ODB%COORini% Failure Programming Spec Trigger #11 Requiring
                                                     Level 1.5 Term #15
      E-INI/ODB%COORini% Failure Initializing Spec Trig #11 Requiring
                                                       of L1.5 Term #15
      E-INI/ODB%COORini% Failure Initializing Spec Trig #11
      E-INI/ODB%COORini% Spec Trig Initialization Failure Count Is 1

I believe that this is the "standard" DigiMem card error that we sometimes see
on an Initialize after a long power down.  This error lasted for about 4
initializes over a period of perhaps 10 minutes (right after power up) and then
this problem went away (as it has before).

All of "Level-1" (L1 FW, L1 CT, L1.5 FW, L1.5 CT and L1.5 CT MTG including
the fan) was powered off for this installation.  After powering back up, we
noticed yellow ERPB LEDs turned on in Racks M107, M108, M112.  We left the
ERPBs in this condition for approximately 1 hour, and then turned on the ERPB
MTG.  When we looked at the ERPB LEDs, all of the yellow LEDs were turned off.
The next time we have the L1 Cal Trig turned off, we should do the following:

    1) check yellow LEDs on ERPBs...DO NOT push the pushbutton on the DCs
    2) if any are turned off, wait a few minutes and check again.  We want
       to see if they "spontaneously" load themselves.
    3) if the ERPB MTG is turned off, look at the LEDs both before and after
       turning the ERPB MTG back on.  We want to see if they load themselves
       based on random noise from the MTG as it is turned on
    4) last of all, run the ERPB_MTG_SETUP.DAT file.  Look at the LEDs both
       before and after running this .DAT file.  We want to prove that this
       file actually loads the ERPBs if they haven't already been loaded (we
       believe that it can successfully re-load the ERPBs).

We worked on the 1-in-30000 hang error (i.e. the chronic error that we have
seen where no ERPB data is transmitted to the DSPs).  We started by loading
the "hang on error" version of the 68K_Services code.  We clipped the Logic
Analyzer on the TTL copy of the "Something Happened" and "That's Me" signals
which were added to the Path Select P2, as well as using the differential-to-
single ended ECL convertor box to monitor one of the "Transmit Trigger" slave
copy signals.  We set the Logic Analyzer to never trigger, and used a -1.4V
threshold for the single-ended "Transmit Trigger."

After the first hang, our suspicion that a "That's Me" occurred without an
associated "Transmit Trigger" (for the "hang" event) was confirmed.  We then
started looking at the "chain" (i.e. Channel 8 to Channel 7 to Channel 5) which
is used to convert "That's Me" into "Transmit Trigger."  The output of Channel
8 (which should be a single 3.5 us pulse starting about 400 ns after the rising
edge of "That's Me") was only a 2.275 us pulse for a "hang" event.  I.e. it
dropped at about tick 65.  This 2.275 us timing was repeatable (not random).
Note that this signal is "picked up" by Channel 7 at about tick 75, i.e. it
was completely missed by Channel 7 for the "hang" event.
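
The tick arithmetic above can be sketched as follows.  The 35 ns tick length
is an ASSUMPTION inferred from 2.275 us / 65 ticks (it is not stated in this
entry); it is at least consistent with the Channel 5 pulse "up at 45 and down
at 72" being about 1 us.  The sketch also assumes that ticks are counted from
the start of the Channel 8 pulse.  The names are hypothetical.

```python
TICK_NS = 35.0  # ASSUMPTION: inferred from 2.275 us / 65 ticks, not from the log

CHANNEL7_PICKUP_TICK = 75   # Channel 7 picks up the Channel 8 output here
HANG_END_TICK = 65          # where the short pulse dropped for a "hang" event


def ticks_to_us(ticks):
    """Convert a number of MTG ticks to microseconds."""
    return ticks * TICK_NS / 1000.0


def channel7_sees_pulse(pulse_end_tick, pickup_tick=CHANNEL7_PICKUP_TICK):
    """True if the pulse is still up when Channel 7 looks at it."""
    return pickup_tick < pulse_end_tick


# A normal 3.5 us pulse expressed in ticks.
NORMAL_END_TICK = round(3.5 * 1000.0 / TICK_NS)

if __name__ == "__main__":
    print("normal pulse ends at tick", NORMAL_END_TICK,
          "seen:", channel7_sees_pulse(NORMAL_END_TICK))
    print("hang pulse is", ticks_to_us(HANG_END_TICK), "us",
          "seen:", channel7_sees_pulse(HANG_END_TICK))
```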

After trying a few things (i.e. replacing the BIT8 PAL in Channel 8 with a new
BIT8 PAL, verifying that the Accelerator Clock and Turn Marker going to this
MTG were OK, and noting that we had already replaced the PROM for this bank),
we decided that the problem was somewhere in the MTG PCB, likely in the clock
generation section of the board.  The soldering on this board did not look very
good--we couldn't see an obvious joint to re-touch, instead we saw a lot of
iffy joints.  We moved all of the components from the "old" ERPB MTG (S/N 24)
to the new ERPB MTG (S/N 21--this card was once the L1.5 Framework Control
MTG; it was removed during troubleshooting but is actually thought to be 100%
OK).  After re-installing, we ran at about 50 Hz (with an additional 170 Hz
of L1 rejects and 30-40 Hz of L1 accepts caused by other people running) for
about 30 minutes with no errors.

Another thing that we need to do with the Logic Analyzer is collect snapshots
of the normal running of the L1.5 Cal Trig.  We should capture the following
signals:

        - Front-End Busy
        - Something Happened
        - That's Me
        - Transmit Trigger
        - Answers/Dones to L1.5 Framework (or at least a time marker)
        - Start Digitize and Hold Transfer

We should get a snapshot of each of the following situations:

        - "N", "n", "I", "i", "F"

This is probably most simply done with no beam.  It might also be nice to
capture a normal mix of events during actual running.

We also thought of two different software-only ways to do the "Panic" mode
operation.  We could either let the 68k_Services CPU immediately say "yes"
to the L1.5 Trigger Framework (at a cost of about 20us deadtime, this is
what the 68K_Services code currently does for a Mark and Force Pass event),
or we could let the 68K_Services CPU wait until the DSPs respond, and then
say "yes" (which would give the correct deadtime, about 130 us from "That's
Me" to clearing the Front-End Busy).  The first mode has the advantage of
reducing deadtime (but note that it will give an unrealistic view of the real
deadtime), while the second mode has the advantage of showing the real dead
time (but also imposing this deadtime on the experiment).  We want to avoid
changing hardware to operate in "Panic" mode.

Tier 2 -Px CAT2 replacement.  Replaced the CAT 2 in slot 24 of the Tier 2
in rack M105.  This is the -Px adder.  Pulled CAT2 SN#90 and replaced it
with CAT2 SN#69.  These are Tier 2 ECO'ed CAT2's.  This is to repair the
problem of sometimes being off by 8 counts in the Px sum.  Philippe traced
this to Px from eta -5:-8 phi 17:24 which is operand #8 on this CAT2 Card.
See entries in this log from 30-June-94, 25-MAR-94,  16-FEB-94 for more
background on this problem.

Run CalTrig_Random.  After replacing this Tier 2 -Px CAT2 card we ran
325k loops of CalTrig_Random test.  This required only 45 minutes to run.
We had only 1 error during this run  (EM Tower Count Ref_0 is 273 instead of
275).  There were no momentum sum errors.

..............................................................................

Date: 30-JUN-1994      At: Fermi  Topics:
                                           Investigate MPt discrepancy (Jill P)

Jill Perkins et al (Nikos, Andrezj Zieminsky) compare the Level 1 Trigger AndOr
Terms to the simulator results. They have a number of discrepancies that need to
be investigated in detail, after the L1.5 CT is delivered. Some are on terms
actually used, others are on unused comparators, etc.  One worrisome error is on
MPt comparators, at the 10 % level. All 3 programmed thresholds show problems,
with rates decreasing with increasing threshold values. This could be a failure
of the Px/Py trees to come up with the correct sum, or of the FMLN to do the
comparison, or hold its memory content, or...

To investigate this problem, we run CalTrig Random Tests. And the symptoms all
point to the old problem with Px at (-5:8,17:24), cf. entries from 16-FEB and
25-MAR. But the problem does not go away this time after pushing on the Tier #2
+Px card. It doesn't go away after reseating the Tier #1 Px card and connectors
either.

The next step is to replace the Tier #2 card, trying to look at the backplane
pins with the flashlight. This isn't done today because we are ready to leave,
and the chances of much beam in the near future are low. Also the error is only
8 counts = 4 GeV of Px.

After failing to solve the problem, we successfully ran 900,000 loops of full
random tests on all etas, and 1/2 the phis (1:16).
..............................................................................

30-JUN-1994

Make tests of the trigger rate vs FEBz% and BX_Lost_to_L15% for the various
types of L15CT cycles.

First look at IBS events where all are CONFIRMED by the L15 Framework.  "I"

          Hz         FEBz%    from Geographic Section 5
        ------   ------------------------------------------------
          57       0.1%
          71       0.1%
          95       3/4 of time 0.1%,  1/4 of time 0.2%
         114       0.2%
      Now turn off sending to the Host to go faster
         286       0.4% 3/4 of the time,  0.5% 1/4 of the time
         500       0.9% 3/4 of the time,  1.0% 1/4 of the time

We also looked at L1's FEBz% during the above sweep

          Hz         FEBz%    from Geographic Section 1
        ------   ------------------------------------------------
          500       2%  +- 1/2%
          518      55%

Now look at IBS events where all are REJECTED by the L15 Framework.  "i"

          Hz         FEBz%    from Geographic Section 5
        ------   ------------------------------------------------
           5.8        0.0%
         284          0.3%
         572          0.7%
         954          1.1%    Here there is 0.33% dead BX during L15 cycles.
         960         59  %    This is the limit of printing "i"s on screen.

Now look at That's_Me events where all are REJECTED by L15 Framework.  "n"
There are zero MFP events at this time.

                        FEBz%
          Hz      from Geographic Section 5       Dead BX During L15 Cycles
        ------   ---------------------------    -----------------------------
           5.7        0.0%                              0.1%
          57          0.1%                              0.7%
         238          0.3%                              3.0%
         477          0.5%                              5.9%
         954          1.0%                             11.8%

    During the above scan there were zero L15FW exits via Timeout.
    During the above scan we learned (or re-learned) that during an L15 FW
    Decision Cycle the Front-End Busy scalers do not increment.
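
    The "n" scan above can be turned into an approximate dead time per L1.5
    cycle by dividing the dead fraction by the rate.  A sketch using the
    numbers from the table follows; the names are made up for this
    illustration.  The 5.7 Hz row is dominated by the 0.1% display resolution
    and should not be trusted.

```python
# (rate in Hz, Dead BX During L15 Cycles in %) from the "n" scan table.
N_SCAN = [(5.7, 0.1), (57.0, 0.7), (238.0, 3.0), (477.0, 5.9), (954.0, 11.8)]


def dead_us_per_cycle(rate_hz, dead_pct):
    """Dead time per L1.5 cycle, in microseconds, implied by one table row."""
    return dead_pct / 100.0 / rate_hz * 1e6


per_cycle = [(rate, dead_us_per_cycle(rate, pct)) for rate, pct in N_SCAN]

if __name__ == "__main__":
    for rate, us in per_cycle:
        print(f"{rate:6.1f} Hz  ->  {us:6.1f} us per L1.5 cycle")
```

    For the rows at 57 Hz and above this gives roughly 123 to 126 us per
    cycle, i.e. the dead fraction scales linearly with rate over this range.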

Now look at That's_Me events where all are CONFIRMED by L15 Framework.  "N"
Do this by setting a CTFE Pedestal DAC up high.  We used  0,169,33,0
and moved it from 35 to 255.   Used a flat over eta,phi 2.5 GeV
Ref Set in L15CT.   There are zero MFP events at this time.

                        FEBz%
          Hz      from Geographic Section 5       Dead BX During L15 Cycles
        ------   ---------------------------    -----------------------------
           5.7        0.0%                              0.07%
          57          0.1%                              0.71%
         238          0.2%                              2.94%
         475          0.5%                              5.84%
         550          0.8%                              6.9%   - L1 FEBz = 60%

Note that with the "n"s  we ran at high rate (238, 477, 954 Hz) for some time
with no problems  i.e. no "e"s.
With "N"s  we ran at 238 Hz and after perhaps 2 minutes had one "e".  Then
running at 477 Hz for a couple of minutes we had another "e".

Now look at MFP events where all are CONFIRMED by L15 Framework.     "F"
We do this by keeping "N" running and setting the MFP ratio to 0.

                        FEBz%
          Hz      from Geographic Section 5       Dead BX During L15 Cycles
        ------   ---------------------------    -----------------------------
           5.7        0.4%                              0.01%
          57          3.9%                              0.14%
          95          6.5%                              0.23%
         149          9.7%                              0.35%
         285         19.5%                              0.7%


Now we begin work with the Error_Recovery routine.  First we note that
the previously valid wake up word and the previously valid Transfer to
214 word need to be cleared before sending the IIOF2 interrupts to the 12
DSP's.  We cause an "e" by telling the ERPB_MTG to ignore its input lines.
This causes an "e" and all appears OK.  The second time that you tell the
ERPB_MTG not to follow its inputs you get another "e" and then the system
hangs with the 68k looking at Readout Control P2 looking at the DONE
line from the VBD executing code in the Complete VBD routine.   Abort
brings the 68k out at $95C7E.

Return the "old" (hang on error) 68K_Services code and try to diagnose one
of the "high-rate N" hangs.  First, try low-rate "N" to prove that everything
is working.  This doesn't work, we get "e" after 2 events.  Try again multiple
times, keep getting "e" very quickly.  Hitting the L1.5 Cal Trig with hardware
resets and re-downloading the MTG (as well as the Cal Trig) does not help.
The diagnosis is the same each time, LDSP A4 (but ONLY LDSP A4) did not get
some of its ERPB data.  This was determined by looking at the DSP_to_68K Status
longwords in VME memory via the BUGMON.  No DSP debugger was involved.

Next tie up the DSP debugger to more carefully study this error.  It happens
again very quickly, and we see that Rack 2 Total Et data for LDSP A4 is
incomplete.   The DMA List for this channel was 1 transfer from the end.
After another reload and restart, the problem appears to go away.  But not
for long...

Start high rate "N" tests with transfer to the host shut off.  After 2 minutes
at 238 Hz, it sticks.  Abort 68K and examine symptoms:  exactly the same (A4
Rack 2 Total Et not complete).  Run some more tests, including setting a L1
HD Trigger Tower to have some Et (using pedestal DAC).  The "big" tower walks
around a lot in A4, but not in A1 (where it shows up in the Rack 1 Total Et).
The problem is eventually traced to another bad CRC->DSP cable.  Replace the
cable and look for the big tower:  It is stable in both A4 and A1.  Run at 28
Hz (transfer to the host turned on) for 20 minutes with no errors.

Begin high-rate "n" tests.  At 286 Hz in to L1.5 Cal Trig (0 Hz out), we hung.
Look at the DSP_to_68K Status words--NO LDSPs have seen their ERPB data.  This
is consistent with the hangs that Dan saw last month.  We were using the
SKIP_TEN_BX And-Or Input Term when the hang occurred.

Try to do the Bill Cobau download.  Eventually we have success, but this is
still clearly a weak link in the system.  Collect a few hundred events on
disk for Dan Owen.  The L1.5 Cal Trig was accepting some events just based
on beam noise, and Dan wanted to examine this phenomenon.

Load the crude Error_Recovery version of 68K_Services.  This version, upon
detection of an error, just clears the Transfer and Wake Up Words, hits the
DSPs with the Error interrupt, and then jumps to Orbit Master (i.e. it does not
try to transfer the event).  Doing Error Recovery in this way causes the L2
Sequencer console to generate one Token Loop Count Overflow for Crate 51 (us).

We leave the Error_Recovery version of the code installed and let it run for
30 minutes.  It successfully recovers from about 10 errors.  We actually were
there to see one "e" appear on the screen.

We leave the L1.5 Cal Trig powered up and 68K loaded, so people can do download
tests (or whatever) over the next few days while we are gone.  We have
removed the PC.  We leave the L15CT VBD  NOT bypassed.
..............................................................................

29-JUN-1994         At: Fermi       Topics: install repaired CRC, install
                                            last PROM in ERPB MTG, clean up
                                            cabling and patch panels at the
                                            back of M124, simple data transfer
                                            tests on new CRC, VSB hang
                                            debugging.

Installed the repaired CRC card in the lower CRC slot in the CRC/MTG backplane
in M124.  Dan cleaned up the cabling and patch panels in the back of
M124 during this installation.  The repaired CRC was able to grab all of
the tokens from the DSPs (after running Load_12_Parameters, all 12 DSP
Status to TCC Longwords were $00000000).  We also installed the last
ERPB MTG Type 2A PROM (for slave channels for |eta| 17..20) in the ERPB
MTG at this time.

We then arranged the L1.5 Cal Trig to "parasite" from a low-rate,
high-prescale Specific Trigger.  This was done by replacing the Start
Digitize to Geo Sect 5 input to ERPB MTG Channel #6 with just the
Specific Trigger Fired signal for the chosen Specific Trigger (#1).  We
then set the "FORCE >=1 Term" switch in the front of the VME crate.

We then wanted to prove that all DSP "ERPB Input" Comm Ports were
receiving 128 bytes of data on each event.  We loaded the normal DSP
code into the DSPs using the JTAG port.  We DID NOT use any eta-coverage
reducing TAKE files on the debugger.  We ran (with beam) an old "Steve
private" version of a baby DSP control program in the 68K.  This program is:

    [TRG_C40.FIRST_RELEASE.SOURCE_68K]NO_CHECK_12_DSP_CONTROL.ABS

Before running this program, use 135BUG to write $EF (as a byte) to
$FFFFF047.  This enables MVME-214 #2 to be a VSB (Load) Buffer.

The L1.5 Cal Trig happily cycled away at 0.6Hz (the rate of our selected
Specific Trigger), without trying to touch the VBD or read out, etc.
This demonstrated that all ERPB input channels were receiving enough
data to generate "DMA Finished" interrupts.

We then started looking at the data more carefully, trying to perform
rationality checks on the incoming data.  During this operation, we saw
another "hang" which was (as far as we can tell without the Logic
Analyzer) exactly like the hangs seen yesterday at MSU.  That is, the
type of hang which occurs if the Hydra tries to access non-existent VSB
memory.  Here is how we got into the hang:

        - DSP's were cycling with no problem
        - stopped triggers flowing to L1.5 Cal Trig (switched into
          "not me" mode with front-panel switches)
        - ABORTed the 68K
        - looked at some valid VSB addresses in an already-existing
          debugger window on DSP B2 (no problem)
        - opened a NEW debugger memory window on B2, pointed at
          different (but still valid) VSB memory---HUNG HERE

Looking at the VSB bus (only with the FLUKE meter), the hang looked
identical to the "MSU hang":

        - PAS* low
        - ERR* 1.529V (on DC setting--this is how it looked on the
                       FLUKE during the MSU hang)
        - BUSY* low
        - BREQ* high

Trying to do a VSB read via 135 hung the 135 in the traditional fashion.
Reading the Hydra's 2400 chip (0x3ffe8000) both reported the error and
terminated the cycle (i.e. the 135 successfully completed its read and
the Hydra "mem" windows updated correctly, showing that they were not
pointed at invalid addresses).

During the hang, the Ironics which enables VSB buffers was read.  It
showed that Buffer #2 was enabled for VSB.  This still looks flaky and
it would be nice to see that if we remove the JTAG debugger then the
hang never occurs.

Continuing the data sanity checks, we found that DSP C4 Rack 2 Total Et
data had the MSBit stuck high for all etas/phis.  DSP C1 Rack 1 Total Et
(i.e. the other copy of these etas) looked OK.  The problem was traced
to a bad CRC->DSP cable--the MSBit appeared open.  The cable was
visually 100% OK, and twisting the end a little bit allowed the signal
to propagate.  This cable was removed and replaced with a new cable.
This solved the "bit stuck high" problem.

Note that the Reference Set data is not exactly correct in the parameter
files on the PC.  These Reference Sets are not used when the L1.5 Cal
Trig has been loaded by TCC.

We loaded the new TCC code and LOADed and STARTed the L1.5 Cal Trig purely
from TCC.  We proved that the 68K_Services code can detect a "Global Stuck
waiting for D3" error by halting a Local DSP.  Then we detached the PC from
the L1.5 Cal Trig.

We ran a few "rate" tests:

        MFP         Fire        GS 5        Transfer    L1.5 Dead
Mode    Ratio       Rate        FEBz        Rate        Crossings
----    -----       ----        ----        --------    ---------
n        10          57 Hz      0.4%        5.71 Hz
n       100          57 Hz      0.1%        0.57 Hz     0.7%
n       100         570 Hz      1.6%        5.57 Hz     7.1%
I       100         0.57 Hz     0.0%        0.57 Hz
I       100         5.7 Hz      0.0%        5.7 Hz
I       100        22.9 Hz      0.0%       22.9 Hz
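
As a quick cross-check of the "n" rows above, the Transfer Rate looks like
the Fire Rate divided by the MFP Ratio.  The sketch below (Python, not part
of any L1.5 Cal Trig code; the function name is made up for this note) just
redoes that arithmetic.  It assumes that only 1 of every (MFP Ratio) "n"
events is transferred, and it does not apply to the "I" rows, where the
Transfer Rate equals the Fire Rate.

```python
# Cross-check of the "n" rows in the rate table:
# Transfer Rate compared with Fire Rate / MFP Ratio.
# ASSUMPTION: only 1 of every (MFP Ratio) "n" events is transferred to the host.

def expected_transfer_hz(fire_hz, mfp_ratio):
    """Expected transfer rate if 1 of every mfp_ratio fired events is transferred."""
    return fire_hz / mfp_ratio

# (MFP ratio, fire rate [Hz], measured transfer rate [Hz]) copied from the table
N_ROWS = [
    (10, 57.0, 5.71),
    (100, 57.0, 0.57),
    (100, 570.0, 5.57),
]

if __name__ == "__main__":
    for ratio, fire, measured in N_ROWS:
        expected = expected_transfer_hz(fire, ratio)
        print(f"MFP {ratio:3d}  fire {fire:6.1f} Hz  "
              f"expected {expected:5.2f} Hz  measured {measured:5.2f} Hz")
```

The 570 Hz row comes out slightly below the simple estimate, which may be
related to the L1.5 Dead Crossings listed for that row.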

What we really need to do is to run, independently and alone, all cases and
plot rate vs. FEBz and L1.5 Dead Crossings.  This is simple for "non-transfer"
triggers, but we can't transfer to the host at high rate.

Note that, when we were running "n"s at 570 Hz, every "F" was followed by an
"N."  This makes sense, because the L1.5 Cal Trig could not respond to the
trigger following the "F" (because it was busy doing its 1.5 millisecond
transfer to the 214), so the L1.5 Trigger Framework times these triggers out.

We also got "e"s on the screen twice.  These "e"s seemed to be related to
changing the trigger configuration, but we need to really understand these.
..............................................................................

28-JUN-1994         At: Fermi       Topics:  Install *BGIN and *BGOUT
                                             pullups on Hydra-II cards.

Installed the (backplane-mounted) VSB *BGIN and *BGOUT pullup resistors on
all Hydras in Crate 0.  These have a 2.7k Ohm resistor from BGIN* (A31) to
Vcc (B32),  a 2.7k Ohm resistor from BGOUT* (C32) to Vcc (B32),  and an LED
with a 470 Ohm resistor from Vcc (B32) to GND (B31).  The pullup resistors
on the BG lines are the same as are used on the MVME135-1 cards.  There are
no pullups on the Hydra-II cards.  The VSB2400 uses an open collector output
on the BGOUT signal.
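
For reference, a rough estimate of the DC currents in these parts.  This is
only a sketch: it assumes Vcc is a nominal 5.0 V and that the LED forward
drop is about 2.0 V (a typical value, not something we measured).  The
function names are made up for this note.

```python
# Rough DC currents for the backplane-mounted pullups and LED.
# ASSUMPTIONS: Vcc = 5.0 V nominal; LED forward drop ~2.0 V (typical, not measured).

VCC = 5.0            # volts, nominal
R_PULLUP = 2700.0    # ohms, BGIN* and BGOUT* pullups
R_LED = 470.0        # ohms, LED series resistor
V_LED = 2.0          # volts, assumed LED forward drop

def pullup_current_ma(v_line=0.0):
    """Current in mA through one pullup when the line sits at v_line volts."""
    return (VCC - v_line) / R_PULLUP * 1000.0

def led_current_ma():
    """Current in mA through the LED and its series resistor."""
    return (VCC - V_LED) / R_LED * 1000.0

if __name__ == "__main__":
    print(f"pullup current with line held low: {pullup_current_ma():.2f} mA")
    print(f"LED current (assumed 2.0 V drop):  {led_current_ma():.2f} mA")
```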
..............................................................................

Date: 28-JUN-1994      At: MSU         Topics: Try to produce VSB hang
                                            conditions while watching
                                            VSB with Logic Analyzer

In the test backplane at MSU, we tried to produce hangs with symptoms
similar to the hangs seen at Fermi last week.  We were able to produce a
hang with similar symptoms by trying to access non-existent VSB memory
with the Hydra.  When looking at the backplane with the Logic Analyzer,
we saw:

        - PAS* low  (active) (this contradicts what we THINK we saw at
                              Fermi last week)
        - BUSY* low (active)
        - BREQ* high (inactive)
        - ERR* went low 102 us after PAS* went low (i.e. the 135
          timed-out the transfer), then high 102 us after that, and
          continued oscillating in this fashion forever

This indicates that the Hydra did not correctly deal with the ERR*
signal going low--this is supposed to indicate a VSB bus error and the
active master should terminate the cycle.  Reading the 135 2400 register
we saw that it did NOT record a Timeout or a Bus Error (why???), nor did
it generate a Bus Error message on the screen (the best guess is that it
will not generate a Bus Error message unless it actually initiated the
bus transaction).  Reading the Hydra 2400 register both returned the
ERROR* bit cleared (indicating that a bus error had been detected at
some time) and ALSO TERMINATED THE VSB CYCLE.  Trying to read a valid
VSB address from the Hydra did not terminate the VSB cycle.  Trying to
read a valid VSB address from the 135 hung the 135 in the same way we
had previously seen (no BUGMON message, ABORT pushbutton ineffective).
This occurs because there is no timeout on a VSB Mastership transfer
when using the MVSB2400 chip.  Breaking the VSB cycle (by reading the
Hydra 2400 register) then allows mastership to transfer and the 135
concludes its cycle normally.
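
The ERR* timing seen on the Logic Analyzer (low for 102 us, then high for
102 us, repeating) works out to a square wave of roughly 5 kHz with a 50%
duty cycle.  The sketch below redoes that arithmetic.  The mean-voltage
function is only an illustration: the logic levels passed to it are
assumptions (we did not record them), but a DC meter averaging such a wave
would read about halfway between the two levels, which may be why the FLUKE
showed about 1.5 V on ERR* during the hang.

```python
# ERR* timing from the Logic Analyzer: low 102 us, then high 102 us, repeating.

T_LOW_US = 102.0
T_HIGH_US = 102.0

def period_us():
    """Full period of the ERR* oscillation in microseconds."""
    return T_LOW_US + T_HIGH_US

def frequency_hz():
    """Oscillation frequency in Hz."""
    return 1.0e6 / period_us()

def duty_high():
    """Fraction of the time that ERR* is high."""
    return T_HIGH_US / period_us()

def mean_volts(v_high, v_low):
    """Average voltage a DC meter would see; v_high and v_low are ASSUMED levels."""
    d = duty_high()
    return d * v_high + (1.0 - d) * v_low

if __name__ == "__main__":
    print(f"period    {period_us():.0f} us")
    print(f"frequency {frequency_hz():.0f} Hz")
    # 3.0 V and 0.1 V are hypothetical TTL-like levels, not measurements.
    print(f"mean      {mean_volts(3.0, 0.1):.2f} V (assumed levels)")
```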

This hang condition was produced both by using the Hydra debugger to
read the VSB locations (and carefully reading only one VSB location,
verified by watching the Logic Analyzer), and also using a baby DSP
program (still running under debugger control) to read the (invalid) VSB
location.

Note that, when the 135 tries to read an invalid VSB address from either
user code or BUGMON, the VSB cycle times-out at the correct time, is
terminated by the 135, and a Bus Error message is printed on the 135
tube.  Note that the Crate Controller (in the 135's 2400) sets ERR* low
in both the Hydra-initiated "invalid" read and the 135-initiated
"invalid" read.  I.e. it never asserts ACK* in response to a VSB bus
timeout.  Note that when the 135 tries to read an invalid VSB address,
ERR* is low for only about 175 ns.  There is no way that the bus cycle is
being terminated by BUGMON intervention (i.e. the cycle is not concluded
via the same mechanism that we were able to conclude the Hydra-initiated
cycle, by reading the MVSB2400 Status Register).  The 135 can't be doing
anything meaningful in 175 ns.

Sometime it would be good to put two 135's in a VSB backplane (or swap
the 135 and the Hydra) and see what a 135 that is NOT the Crate
Controller does in response to a Crate Controller bus timeout.  I.e. would
it act the same as the Hydra, or is the Hydra wired up in some funny way?
All of this processing should be internal to the MVSB2400 chip so it is hard
to see what Ariel could have done to screw this up.
..............................................................................

Date: 22,23,24-JUN-1994   At:  Fermi

22-JUN-1994         At: Fermi       Topics: install baby VME rack,
                                            2-headed TCC, upgraded
                                            Term Select Card, first
                                            meeting of TCC vs 68K vs DSP

Replaced D0HTCC with the new 2-headed TCC.  We currently
have only one 2-headed TCC and need to convert the "old" TCC to 2-headed
operation.  The clamp on D0HTCC has been removed, so it is fast to remove
and replace D0HTCC if necessary.  Installed the baby VME crate (with the
pVBA and VI-monarch cards) in the BA-23 box, and connected the VI-monarch
to the (newly-installed) VI-serf in the L1.5 CT VME crate.  Also installed
the upgraded Term Select P2 Paddleboard in the L1.5 CT VME crate.  Loaded
the new (2-headed [v6.0??]) TRICS code in D0HTCC (and also verified that
the 1-headed TRICS code [v5.3??] can run in this 2-headed box).

We then tried to LOADCODE into the L1.5 Cal Trig.  This was the first
meeting of TCC and 68K_Services, and a few problems were found.  After
making a special version of 68K_Services (which moved the 68K-to-TCC
Status Block from its actual home to a location where TCC was expecting it),
we were able to successfully LOADCODE to the L1.5 Cal Trig (i.e. no errors
were generated).

Weird behavior of DSPs with debugger attached:

    1) at one time, everything was fine
    2) then DSP A4 can load code, but not hear load param interrupt from TCC
    2b) load param 68k program shows the same thing
    3) the sanity/config startup program on Hydra A hangs at DSP 4
    4) only after "parking" A4 at PC=%X00000040 can the sanity/config run thru
    5) DSP A4 still fails to hear the load param interrupt
    6) exiting the A4 debugger session made no difference
    7) exiting all debugger sessions made no difference either
    8) shut down the PC and unplug pod, A4 now hears the load param interrupt,
    8b) but A1 does not loadcode
    9) run the sanity/config and all Hydra A DSPs 100% happy again.

   10) reconnect PC, A1 does not loadcode
   11) see where A1 is: it has bad memory controller setup words
   12) fix memory ctrl setup words helped see code, but A1 still won't load code
   13) run the sanity/config and all Hydra A DSPs 100% happy again.

After the LOADCODE success, we tried to START the L1.5 Cal Trig.  An "EVE
Block Copy" bug in TRICS prevented us from STARTing any Local DSPs other
than A1, B1, or C1.  We thought we could first execute LOAD_12_PARAMETERS
to load the DSPs, and then START would produce no errors.  That trick did
get the DSPs to load their Parameters, but when the TCC downloads parameters
to the Shared Dual Port Memories, it overwrites the DSP_to_TCC and DSP_to_68K
Status Longwords, so START still produced errors.  Note that this is actually
the desired, designed-in behavior of START.

Note that, to run the L1.5 Cal Trig with reduced eta coverage, a TAKE file
is required.  That means that manual intervention via the PC-based debugger
is required sometime after the LOADCODE occurs but before the first event
will flow through the L1.5 Cal Trig.

We saw some problems with access of the VSBs via DSP Debugger.  More on this
later.

After the accelerator broke, we continued to check TCC vs. 68K vs. DSPs.
We were able to verify that the Pass_one_of_(N) counter was correctly read
by the 68K (after a minor code change...) and that the Frame Parameter and
Tool Parameter sections of the Data Block were being correctly programmed
by the 68K.

We then tried to actually take some data with the L1.5 Cal Trig, to prove
that LOADing and STARTing via TCC were actually working.  That failed--after
LOADing and STARTing, the L1.5 Cal Trig would "transfer" 2 events and then
hang with 100% Front-End Busy, and the 68K would not respond to its ABORT
push-button.  After trying various combinations of TCC-based STARTing and
also using Load_12_Parameters, and additionally removing lines from
68K_Services, we thought that the problem was related to the Load_Parameters
interrupt processing in the 68K.  We removed the VSB writes from
Load_Parameters, and TCC-based STARTing appeared to work OK.  We then restored
the VSB writes, but left the VSB pointing at Buffer 2 (not Buffer 1), but
kept the Which_214_Is_Load_Buffer flag pointing at Buffer 1.  Again, this
worked OK.  Why??!?

Some further notes on the above problem:  it appears that the first VSB
writes by the 68K AFTER the VSB writes in Load_Parameters were "hanging."
This is funny for 2 reasons:  (1) if something is being screwed up in
VSB by the Load_Parameters interrupt, why should the VSB writes in
Load_Parameters succeed?, and (2) note that the VSB bus isn't supposed to
hang on a normal write cycle, but instead the MVSB2400 on the 135 is programmed
to bus-error after some amount of time.  Is there something funny in mastership
arbitration?  This also seems unlikely, because the Global DSP should not have
requested mastership at this time.  This problem is probably related to Steve's
earlier difficulties accessing VSB space via the DSP debugger.

Also, the new TCC with the new code went into KERNEL EDEBUG.  See the TCC
logbook.


23-JUN-1994         At: Fermi       Topics:  More on VSB hang problem, install
                                             all remaining CRC->DSP cables,
                                             copy M111 data to M112 for
                                             a test

Even more notes on the VSB hang problem:  Today we were totally unable to
re-create the VSB hang problem.  We returned to the "old" 68K_Services
and did things which absolutely hung the system last night (RESET, g 95000,
LOADCODE, START, ABORT, take eta_1_8.tak, prun -r, g 95000).  This worked
fine.  We were not able to try to transfer events, because the accelerator
was running all day.  We need to understand this problem: what is the real
problem?  We may only be looking at symptoms.

We met with Bill Cobau for a few minutes to make a special run request,
and also describe the test that we want to do.  Bill claimed that he would
make up some trigparse files.  We asked him for the output from COORSIM
for these files, as a way to verify that the COOR-to-TCC messages would
be rational.  He did not provide these.  None of us want to waste beam
time debugging COOR-to-TCC messages, so we did not actually request the
special run.

The new TCC with the new code went into KERNEL EDEBUG again, in a similar
(but not identical) way as yesterday.  We put the old code in the new
TCC as a way to try to see whether the problem is in the new code or the
new box.  See the TCC logbook.

We installed the remaining CRC->DSP cables.  This installation is now
permanent.  It will be a major job to remove any Hydra-II cards now.

Dan made a "splitter" cable to feed the ERPB data from M111 to the CRC
channels for both M111 and M112.  This way, we will have data being
fed to all DSPs, and we will not need to do anything "funny" during
TCC-driven LOADs and STARTs.

Steve fired up the DSPs to see whether they could run with the "old"
No_Check_12_DSP_Control program.  The answer is yes, as long as one of
the VSB buffers is enabled "by hand."  In the process of making this test,
we noticed that the DSPs connected to the new CRC were still making Token
Warnings at Load_Parameters time.  After poking around, we realized that the
problem was that the new CRC is missing its 1 MHz crystal.

We could not find a 1 MHz rock, so we installed a 2.5 MHz rock in the CRC
instead.  This did not work (see further entries for 24-JUN-1994).


24-JUN-1994         At:  D-Zero         Topics: More attempted CRC repair,
                                                continue "VSB hang" diagnosis,
                                                Try to fully download L1.5
                                                Cal Trig from Cobau-generated
                                                file.

We continued the "VSB hang" diagnosis.  We used the "canonical" 68K_Services
code (not any of the funny "this.abs" code).  Upon first starting the
L1.5 Cal Trig from power-up, we did not experience the VSB hang.  We did
experience the VSB hang once after trying to start the L1.5 Cal Trig without
first using the "take eta_1_8.tak" command (the hang occurred after we recovered
from the normal "D3 hang" that is expected when the reduced eta coverage
take file is not used).  We looked at some VSB signals both before and
during the hang.  The only thing that looked funny (both before and during
the hang) was the *BGOUT from Hydra-B.  It was at about 2-2.8V, i.e. not really
a good TTL level.  This is also what the *BGOUT from Hydra-C and Hydra-A were
doing.  We removed the Bus Grant jumper between Hydra-B and Hydra-C.

We tried single-stepping through the 68K_Services code.  We determined that
the hang occurs immediately after the 68K tells the Global DSP to transfer
data to the MVME214.  Note that this is the first VSB write in "cyclic"
processing (68K does VSB writes during Load_Parameters processing...these
appear to work).  Note also that the Global DSP is requesting VSB Mastership
at this time.  Steve looked at the DMA List to notice that NO data had been
successfully transferred to the MVME214 by the Global DSP.

We re-started the L1.5 Cal Trig, and stopped the 68K just before it tried to
tell the Global DSP to transfer to 214.  Using the DSP debugger, we looked
at VSB memory via DSP (i.e. request bus mastership again).  We expected a
hang, but it did not hang.  We were able to transfer VSB mastership back and
forth between 68K and DSP.  Also, when we released the L1.5 Cal Trig to run
at speed, there were no hangs.  We have not seen a hang since.  What is going
on?  Note that in our single-step test, the DSP got VSB mastership not in
conjunction with a DMA list, and a READ (rather than a WRITE) followed the
Mastership transfer.

We tried to use the Bill Cobau-provided configuration file to LOAD and START
the L1.5 Cal Trig (as well as set up the rest of the data acquisition system).
There were a few errors discovered during this test, but we were able to
work around most of them.  This configuration file was more complex than
we would have liked, though, because it had multiple Specific Triggers
digitizing Geo Sect 5.  This caused the L1.5 Cal Trig to hang, because the
Path Select paddleboard was not being used.

We then looked again at the lower CRC, and noticed another jumper wire (the
VME_RESET-to-VCC wire) was missing.  Steve installed this jumper, and quickly
examined the CRC for other problems, and then re-installed the CRC.  It still
did not grab tokens from the DSPs.  We tied a "lower CRC" DSP up to the upper
CRC to verify that the problem was not in the DSP.  We tried making a crude
switch to replace the crystal.  This did not work either.  Upon (another)
examination of this CRC, we noticed that it does not have the Token Grabber
pin #1-to-VCC jumper cosmetic traces cut (as described in the CRC Description).
This would absolutely cause the Token Grabber PALs to not grab Tokens.  We
probably wrecked the 2.5 MHz crystal, but what problems were caused by the
switch, which tried to ground these pins?  Steve checked the ECOs on this
card vs. the ECOs in the CRC Description, and the un-cut traces were the only
remaining problem (that Steve could find).  We are returning this card to
MSU for repair and Token Grabber PAL testing.  This test is easy to do at
MSU.

We then switched to using the Path Select Paddleboard.  We were able, using
4 different TAKERs, to produce all normal paths (F,N,n,I,i) through the L1.5
Cal Trig.  This was done by programming 4 Specific Triggers, which each always
produced the same result.  The MFP ratio was set to 100.  We started examining
rates and dead times, but ran into problems during this process, probably
because the Specific Triggers did not have "SKIP_n_BX" And-Or Input Terms
defined, which interferes with the Transmit-Trigger generation of the L1.5
Cal Trig.  Here are the rates we found:

  ST 0    ST 1     ST 2    ST 3
  Normal  Normal   IBS     IBS                  L1          L1
   Fail    Pass    Fail    Pass                 Trigger     TRANSFER
   (n)     (N)     (i)     (I)      FE Busy     Rate        Rate
  [Hz]    [Hz]    [Hz]    [Hz]      GS 5        [Hz]        [Hz]
--------------------------------------------------------------------
   6       5.6     5.2     5.7      0.1%        22.4        11.4
  61       6      54       5.5      0.3%       126          12

With some energy in the detector (but only protons), and using the Skip_10_Bx
And-Or Input Term on each of the Specific Triggers, we took the following
data:
                                                L1          L1
                                                Trigger     TRANSFER
  ST 0    ST 1     ST 2    ST 3     FE Busy     Rate        Rate
  [Hz]    [Hz]    [Hz]    [Hz]      GS 5        [Hz]        [Hz]
--------------------------------------------------------------------
 474       8.2    430      9        14.7        916         153

Note that the ST's did not map directly to NnIi, but some of our desired
n's became N's.  Also note that we were not using the Term Select Paddleboard
during this short run.  Instead, we had thrown the >=1 Term Selected switch.
The only characters on the 68K_Services screen were all lower case "n" and
lower case "f".  This is consistent with throwing the >=1 Term switch.  It
is good to note that, using Skip_10_Bx, we were not hanging the L1.5 Cal Trig.

..............................................................................

Date: 17-JUN-1994   At: Fermi   Topics: Replace CAT2 at Tier 1 Py eta -13:-16
                                        phi 17:24,  Install/rework L15CT cables
                                        into M124,  Power Supply voltages in
                                        L15CT,  Install Bus Grant wire in upper
                                        L15CT crate, Install Panduit plastic
                                        cable tray on front of M124  C-channels,
                                        Verify D0HTCC - BA23 configuration.

During the day of 16-JUNE-94, we had some mail messages from TRICS about
problems initializing some L1 Cal Trig registers.  Looking in the Trics_Log
one saw errors at:

CBUS 1    MBA 207    CA 20    FA 16, 19, 22, 25, 28, 17, 20
e.g.    WRITE   0       READ    63

and  %% time: 17-JUN-1994 01:00:40.64
E-HIO/HDB%COORini% Failure Writing  63 @ cbus 1 mba 207 ca 20 fa  23 read   0
E-HIO/HDB% (cont) Data = Out 00111111 In 00000000 Mask= W 00111111 R 00111111
E-HTT/ODB%COORini% Failure Programming Lrg Tile NEG,E_13_16,P_17_24,REF_1
E-INI/ODB%COORini% Failure Initializing Large Tile  NEG,E_13_16,P_17_24

These errors are from the CAT2 card that services  eta -13:-16  phi 17:24
PY Tier 1.   Pull CAT2 SN#254 from service at eta -13:-16 phi 17:24 PY Tier 1.
Try using CAT2 SN#176.  It makes lots of readback (and other?) errors. Pull it.
Install CAT2 SN# 178.    SN#178 looks OK so far.   Return CAT2's SN#254 and
SN#176 to MSU for repair and testing.
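
The "Out / In / Mask" line in the TRICS log excerpt above can be checked by
hand with a masked compare.  The sketch below is our reading of those fields
(value written, value read back, read mask); it is not the actual TRICS
comparison code, and the function name is made up for this note.

```python
# Masked write/readback compare, using the bit strings from the log excerpt.
# ASSUMPTION: "Out" = value written, "In" = value read back, "R" = read mask.

def bad_bits(written, read_back, read_mask):
    """Bits (within read_mask) where the readback differs from what was written."""
    return (written ^ read_back) & read_mask

OUT = int("00111111", 2)      # Data Out (writing 63)
IN = int("00000000", 2)       # Data In  (read 0)
R_MASK = int("00111111", 2)   # Mask R

if __name__ == "__main__":
    diff = bad_bits(OUT, IN, R_MASK)
    print(f"bad bits: {diff:08b} ({bin(diff).count('1')} of the masked bits differ)")
```

With these values every masked bit reads back as 0, which is what one would
expect if the whole register on the card is not responding, rather than a
single stuck bit.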

While looking in the Trics_Log I also see from time to time:

%% time: 17-JUN-1994 01:50:51.64
S-INI/ODB%COORini% Initializing all Specific Triggers
E-HIO/HDB%COORini% Previously  11 instead of  15 @ cbus 2 mba 129 ca  8 fa  14
E-HIO/HDB% (cont) 00001011 i/of 00001111 Msk= W 11111111 R 11111111, Writing 240

Install the power cables for the ERPB-MTG and the 2nd CRC.  Straighten up the
cables coming into the top of M124 and their flow down the side.  Check L15CT
power supply voltages:

    On the CRC card:   +5.024 Volts  and  -4.506 Volts
    On the MTG card:   +4.968 Volts  and  -5.194 Volts
    Power Pan test points:   +5.196 Volts,  -2.027 Volts,  -4.520 Volts
                             and  -5.406 Volts
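
For the record, the +5V numbers above work out as follows.  This sketch only
subtracts the readings listed above; treating the +5.196 V Power Pan test
point as the source for both cards is an assumption.  The negative supplies
are not compared because the CRC and MTG readings appear to be on different
rails.

```python
# Differences between the +5V readings (all values in volts, from the list above).
# ASSUMPTION: both cards are fed from the +5.196 V Power Pan test point.

PLUS5_CRC = 5.024
PLUS5_MTG = 4.968
PLUS5_POWER_PAN = 5.196

def delta_mv(a_volts, b_volts):
    """Difference a - b in millivolts."""
    return (a_volts - b_volts) * 1000.0

if __name__ == "__main__":
    print(f"CRC minus MTG:       {delta_mv(PLUS5_CRC, PLUS5_MTG):.0f} mV")
    print(f"Power Pan minus CRC: {delta_mv(PLUS5_POWER_PAN, PLUS5_CRC):.0f} mV")
    print(f"Power Pan minus MTG: {delta_mv(PLUS5_POWER_PAN, PLUS5_MTG):.0f} mV")
```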

Install the Bus Grant Priority 3 jumper wire from Slot #9 (MVME135-1) pin
P1-B11 (BG3Out)   to   slot 1 (Vert Inter Slave from TCC) pin P1-B10 (BG3IN).
Verify that the BG3 jumper is removed from slot 8.

Install the plastic Panduit cable tray, mounted on an aluminum bar, on the front
of the "C" channels on M124 to hold the rest of the CRC to DSP cables.  I put
on only one of the two cable tray sections because there is limited vertical
space.

Verify D0HTCC - BA23 physical stack configuration.   BA23 is on top.
4000 box can be removed without taking the BA23 and its shelf off of the
stack.  The clamp over the top of the 4000 box can be removed without taking
anything else off.

Remember to purchase an "A", "B", "C", "D" switch for the 68k and the 3 Hydras.

Take the Term Select P2 card to MSU to have the 16 address readout wires added.
Take all remaining CRC to DSP cables to MSU to have labels changed to get
ready to install them.
..............................................................................

Date: 8,9,10-JUNE-1994  At: D-Zero Hall  Topics: Test ERPB to DSP data
                                                 transport in the eta ranges,
                                                 +9:+12,  -9:-12,  +13:+16.
                      Edit L1.5 CT Driver Documents.   Cable installation work.
                      Move the CRC.   Install ERPBs/DCs in Racks M110 and M111.
                      Install the L15CT cables to M111 and M112 (ERPB-MTG and
                      DC to CRC).   Repair the CTFE that services -12,11 HD.

Dan put connectors on the M124 end of the DC --> CRC cables from M107,
M108, M109, and M110.  The DC->CRC cables from M109 and M110 are not
yet threaded through the clamp in M124.

Dan also installed the DC --> CRC and MTG --> DC cables for Racks M111
and M112.  These cables are not yet routed through cable clamps at either
end, but they do have connectors installed.

We removed the power cable for the ERPB MTG.  This cable will be modified
(made beefier) at MSU.  This is being done to allow us to equalize the +5V
between the CRC cards and the ERPB MTG.  Currently the +5V on the MTG is
about 300 mV lower than the +5V on the CRC.  When we re-install this cable,
we will also install the power cable for the "lower" Crate 0 CRC, and also
thread the L1 <--> L1.5 cables described above through the cable clamp in
the top of M124.

We did not have the 2nd CRC card, so we did not plug the ERPBs into the "correct"
DSPs.  Instead, we used the following DSPs:

                   /--->    DSP A1  Rack #2
        M109    --<
                   \--->    DSP B3  Rack #1

                   /--->    DSP B3  Rack #2
        M108    --<
                   \--->    DSP B4  Rack #1

                   /--->    DSP B4  Rack #2
        M107    --<
                   \--->    DSP B1  Rack #1

We tested these 3 ERPB -> DC -> CRC -> DSP data transport paths at a low
trigger rate (about 1 Hz).  We used the standard "pallet" files:

        L15CT_PALLET_Mxxx_*.DAT     where xxx = Rack Number
                                            * = 80 (EM=TOT=$80 everywhere)
                                                7F (EM=TOT=$7F everywhere)
                                                55_AA (EM=TOT= $55/$AA
                                                       alternating)

        Recall that these 3 files test each bit high, each bit low, neighbor
        bits shorted, and also maximum switching between consecutive eta/phi
        time slices during the data transport phase
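
A small sketch of why these byte patterns cover the cases listed above.  It
only looks at the 8-bit values ($80, $7F, $55/$AA); it does not read the
actual pallet files, and the function names are made up for this note.

```python
# Bit coverage of the pallet byte patterns $80, $7F and $55/$AA.

PALLET_BYTES = {
    "80": [0x80],
    "7F": [0x7F],
    "55_AA": [0x55, 0xAA],
}

def bits_driven_high(values):
    """Mask of bits that are 1 in at least one pattern."""
    mask = 0
    for v in values:
        mask |= v & 0xFF
    return mask

def bits_driven_low(values):
    """Mask of bits that are 0 in at least one pattern."""
    mask = 0
    for v in values:
        mask |= ~v & 0xFF
    return mask

def neighbors_all_differ(v):
    """True if every pair of adjacent bits differs (a neighbor short would show up)."""
    return all(((v >> i) & 1) != ((v >> (i + 1)) & 1) for i in range(7))

def bits_toggled(a, b):
    """Number of bits that switch when going from pattern a to pattern b."""
    return bin((a ^ b) & 0xFF).count("1")

if __name__ == "__main__":
    every = [v for vals in PALLET_BYTES.values() for v in vals]
    print(f"bits driven high: {bits_driven_high(every):08b}")
    print(f"bits driven low:  {bits_driven_low(every):08b}")
    print(f"$55 neighbors differ: {neighbors_all_differ(0x55)}")
    print(f"$AA neighbors differ: {neighbors_all_differ(0xAA)}")
    print(f"bits toggled $55 -> $AA: {bits_toggled(0x55, 0xAA)}")
```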

We found no shorted bit or switching problems in Racks M107, M108, M109 during
this test.

This test was done using the standard 68K_Services code in the MVME135, and
the Trigger Tower data was examined by hand from the DSP DeBugger (by looking
in the DeBug section of the Data Block).  We have only looked at a few
events this way.  We did not use any of Steve's old "Learn and Check" 68K
code, because that code only knows about the DSPs, it does not know about
the rest of the hardware which is now installed in the L1.5 CT VME Crate.

We have not yet looked at the M106 ERPB data since Dan installed the special
cable for the Trigger Towers at eta = -5, -6, phi = 13, 14, 16.  We still
have a stuck bit in Rack M103 (see the electronic logbook dated 22-25 MAR
1994 for details of this stuck bit).  This stuck bit is the only known
problem with the L1 Cal Trig to L1.5 Cal Trig data path (for Racks M103 to
M109).

We will lay out the CRC/MTG backplane in M124 as follows (note that this
involves moving the currently-installed ("upper") CRC, which we did):

    Slot #1 (top)       unused
    Slot #2             CRC for Racks M104, M103, M106, M105, M108
    Slot #3             unused
    Slot #4             CRC for Racks M107, M110, M109, M112, M111
    Slot #5             unused
    Slot #6             unused
    Slot #7             ERPB MTG
    Slot #8 (bottom)    unused

This will probably change if/when L1.5 Cal Trig Crate #1 is built up.

We installed (9-JUN) ERPBs and DCs in Racks M110 and M111.  The installation
in M110 is 100% complete, but M111 requires a little bit of clean-up work.
The DC-to-ERPB Parallel Timing Cable in M111 still needs a terminator, and
some cables in M111 need a final taping into position.  We still need to
install ERPBs and a DC in Rack M112.

The ERPB and DC installation seemed to go a bit faster this time, about
2.5 to 3 hours per rack.

We have done no data path testing of Racks M110 and M111.

When we applied power to the Cal Trig, we first turned on the upper backplane
in a rack, and then turned on the lower backplane.  In some racks, the DCs
had successfully downloaded (i.e. the yellow LED was not lit) the ERPB LCAs
without needing a push of the button, but in other racks they hadn't.  In all
racks, we still pushed the "download LCA" pushbutton on the DCs.  In Rack
M103, the yellow LEDs flashed OFF briefly when the button was pushed, but
then came back on.  This is the same problem we saw earlier on Racks M103,
M104, and M105.  The best guess is that there is a problem with the DC
(because ALL yellow LEDs in M103 acted the same).  These 3 DCs (M103, M104,
M105) are all "prototype" DCs, which have PCBs identical to the
"production" DCs, but are not assembled as nicely.  We have 13 production
DCs, so we could replace these cards the next time we have power off in
the Cal Trig.  Would that solve the problem???

Edit:

      L15CT_Shared_Dual_Port_Memory_Map.txt  to indicate the new larger
                                             68k_Services to TCC Status Block.

      Wakeup_L15CT_Outline.txt   to indicate that TCC does not write the Frame
                                 Parameter Section or the Tool Parameter
                                 section of the L15CT Data Block into the
                                 MVME214 memory modules but rather 68k_Services
                                 does this in response to a Load_Parameters
                                 interrupt.

      Status_68k_to_TCC.txt   to standardize the status codes from 68k_Services
                              to TCC and to add "un-stick DSP" information to
                              this status block.

      Services_68K_Draft_1.txt      to describe the "un-stick DSP" (which
                                    should really be called the "system-level
                                    automatic error recovery") steps.  This
                                    includes adding information in both the
                                    Unstick_DSP description (to describe how
                                    to unstick), and also in the house-keeping
                                    and linear code sections (to describe
                                    when to unstick).

      L15CT_Data_Block_Section_Layout.txt   to indicate that the Frame
                                            Parameter and Tool Parameter
                                            Blocks are written by the 68K
                                            (not the TCC) when it receives
                                            the "Load Parameters" Interrupt
                                            from TCC, and also to provide
                                            a DSP vs. VME vs. VSB address
                                            map of the Data Block in the
                                            MVME214.

      Initialize_DSPs_Via_TCC.txt   to clean up the details of TCC actions
                                    to the DSPs during "Wakeup" time.  Note
                                    that this file does not describe the
                                    overall system flow (that is the job
                                    of the Wakeup_L15CT_Outline.txt file),
                                    but only talks about the relationship
                                    between TCC and the DSPs at Wakeup time.

Install the ERPB-MTG cable to racks M111, M112.  This cable is 43 sections
long.  The extra length in this cable is randomly (but orderly) folded up in
the tray above M112.  Note the entry from 7-APRIL-1994 specifically about
the differences in ERPB-MTG cable lengths (to compensate for the Cal Trig
MTG cable length differences).  Sometime this needs to be looked at again
because it looks like the skew in cable lengths between M103:M106  and
M107:M110 may be backwards.  This +- 10 nsec effect does not make much
difference for 3.5 usec running but it may need to be looked at in the future
for high luminosity running.

Install the DC to CRC cables to M111 and M112.  The cable to M111 is 28
sections long and the cable to M112 is 31 sections long.

Repair the CTFE that services -12,11 HD.  This is CTFE sn#246.  About 2 or 3
weeks ago -12,11 HD drifted high by a couple of counts for a run or 2.  Joan
Guida watched it and it went away.  It drifted up again in the run last night.
Joan showed it to us and we could see it in TrgMon ADC Counts.  It was showing
15 or 16 counts.  This is very close to being in trouble.  We pulled it and
replaced the 0.1 ufd cap on Ch#4 HD.  This fixed the problem.  Also replaced
the cap on Ch#2 EM because it looked a bit funny on the Fluke ohm meter.
..............................................................................

Date:  7-June-1994     At: MSU    Topics:  Experiments with DSP DeBugger and
                                           TCC

Steve and Philippe tried loading and starting the DSPs from TCC both with
and without the DSP DeBugger.  We also tried various combinations of
starting, stopping, and disconnecting the DeBugger to see what problems
arise. Here is a summary:

    - Loading and Starting code from TCC, with the DeBugger present the
      entire time, works IF the DeBugger has left all DSPs (that will be
      Loaded and Started) in the RUNNING state

        - The safest mechanism is to begin running the DeBugger, verify
          that all DSPs are in the RUNNING state (or put them in the
          RUNNING state), and then issue one or two (or more) VME SYSRESETs,
          watching both the Dual Port Access LEDs and the Serial Port of
          each DSP to verify that the Sanity and Configuration Checker
          executes successfully.  Then verify that all DSPs are in the
          "do-nothing dead loop" of the Sanity and Configuration Checker
          (which is at about 2ff931h).

    - If any DSP has been HALTED by the DeBugger, there is NOTHING TCC
      can do to make the DSP run again.  I.e. the DeBugger HALT has priority
      over both the Boot Control Register RESET and the hardware VME SYSRESET.

        - In this case, either the DSPs must be put into the RUNNING state
          via the DeBugger, or the DeBugger must be physically removed from
          the system (i.e. turn the PC off and unplug the DeBugger pod from
          the PC)

    - The DeBugger cannot debug any DSP which has been placed in RESET
      by the TCC (via the Boot Control Register).  The DSP must be unRESET,
      either via the Boot Control Register, or through a VME SYSRESET.

    - The DeBugger can be both exited and physically disconnected from the
      DSPs (i.e. PC turned off and DeBugger pod unplugged from the PC)
      while programs are running on the DSPs, or while the DSP is halted.
      In either case, the DSP is left in the RUNNING state (so, for example,
      the TCC can Load and Start the DSPs).

    - The DeBugger can be physically re-connected and started while
      programs are running on the DSPs.  The general context of the
      DSP is not lost when this is done, but the DSPs may move from the
      RUNNING state to the HALTED state.  This is perhaps the known problem
      between PDM and the individual DeBuggers.  PDM will think that the
      DSPs are RUNNING, while the individual DeBuggers will think that the
      DSPs are HALTED.  The DSPs actually appear to be HALTED.  I think that
      in order to move the DSPs to the RUNNING state, it is best to first
      single-step each DSP a few times.  I have sometimes had problems
      telling the DSPs to RUN immediately after physically reconnecting the
      DeBugger
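
The rules in this summary can be written down as a small truth table.  The
sketch below is only a restatement of the observations above; the state names
and function names are invented for this note and do not correspond to any
TCC or DeBugger interface.

```python
from enum import Enum


class DebuggerState(Enum):
    """State the DeBugger has left a DSP in (names are illustrative)."""
    RUNNING = "running"
    HALTED = "halted"
    ABSENT = "absent"  # DeBugger exited and pod unplugged from the PC


def tcc_can_load_and_start(debugger_states):
    """TCC can Load and Start only if no DSP is HALTED by the DeBugger.

    A DeBugger HALT has priority over both the Boot Control Register
    RESET and the VME SYSRESET, so one HALTED DSP blocks the load.
    """
    return all(s is not DebuggerState.HALTED for s in debugger_states)


def debugger_can_debug(held_in_reset_by_tcc):
    """The DeBugger cannot debug a DSP that TCC holds in RESET."""
    return not held_in_reset_by_tcc


if __name__ == "__main__":
    print(tcc_can_load_and_start([DebuggerState.RUNNING] * 12))
    print(tcc_can_load_and_start(
        [DebuggerState.RUNNING, DebuggerState.HALTED]))
    print(debugger_can_debug(held_in_reset_by_tcc=True))
```

A disconnected DeBugger is treated like RUNNING here because the entry above
says the DSPs are left in the RUNNING state after the pod is unplugged.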
..............................................................................

Date:  6-June-1994     At: DZero  Topics:  Replace Power Pan in Upper Tier 1
                         and MSU           in M107.

About 10:30 Dan Owen called from Fermi.  The -4.5 in the upper Tier 1 Power
Pan in M107 had died.  He and Mike Matulik will work to replace it.  There
were 2 or 3 hours of the store left to go and Muon wants to use it.  Thus
they will turn off all of L1 Cal Trig but keep the FW's running.    This has
the problem that L1 VME Transfer Computer will find lots of  Pilot COMINT
errors  (really first read vs last read errors in L1 Cal Trig CTFE ADC data
errors). The 4 usec required to put these on the screen will crash L2 (i.e.
if it ever needs to resync a data cable it will never be able to,  a new
"feature" of L2 code).

So in TrgCur: there is a new special version of  RunMe68020.ABS  called
RunMe68020.abs_No_Cal_Trig_Error_Check.    They load this version of VTC code
to use while L1 Cal Trig is turned off.   The plan is to maintain this version
of VTC code to run when L1 Cal Trig is off.

Dan and Mike replaced the Power Pan within a couple of hours and TrgMon says
that L1 Cal Trig is running OK again.
..............................................................................

Date:  1:3-June-1994  At: D0-Hall  Topics:  Talks at the D0 Collaboration
                                   Meeting, Install Norm Amos scaler, Install
                                   ERPB's in M109, Fix ERPB cabling in M106,
                                   Files that can be deleted,

At the Collaboration Run Meeting Dan Owen gave an L15CT talk showing 1x2, 3x3
Electron results.  Presented stuff at the first Blazey UpGrade Trigger meeting.
Had the first meeting of the Electronics Board for Run II.

Files that can someday be deleted:

In TrgL15CTHST:
       CALOR$1_078267_01.X_ZRD01;1                   4095/4095   9-MAY-94
In VWork1:
       L15CT_THATS_ME_MFP_ZBD.TXT_29_APR_1994;1       405/405   29-APR-1994
       L15CT_THATS_ME_NORMAL_ZBD.TXT_29_APR_1994;1    320/321   29-APR-1994
       L15CT_ZBD_IBS_29APR94.TXT;2                     44/45    29-APR-1994
       L1_L15CT_PRISTINE_ZBD_RAW.TXT;2                721/723   29-APR-1994

This is about 5585 blocks.
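
The total can be re-checked with a few lines.  This sketch assumes the two
numbers in each listing line are used/allocated blocks and that a disk block
is 512 bytes; the dictionary is just the listing above retyped.

```python
# (used, allocated) block counts copied from the listing above.
FILES = {
    "CALOR$1_078267_01.X_ZRD01;1": (4095, 4095),
    "L15CT_THATS_ME_MFP_ZBD.TXT_29_APR_1994;1": (405, 405),
    "L15CT_THATS_ME_NORMAL_ZBD.TXT_29_APR_1994;1": (320, 321),
    "L15CT_ZBD_IBS_29APR94.TXT;2": (44, 45),
    "L1_L15CT_PRISTINE_ZBD_RAW.TXT;2": (721, 723),
}
BLOCK_BYTES = 512  # assumed disk block size

used_total = sum(used for used, _ in FILES.values())
allocated_total = sum(alloc for _, alloc in FILES.values())

if __name__ == "__main__":
    print(used_total)                # 5585
    print(allocated_total)           # 5589
    print(used_total * BLOCK_BYTES)  # 2859520 bytes
```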

Modify Scaler #1 on DBSC card SN# 06 so that it has both a hardware reset
input and so that its clock signal comes from an on board 10 MHz crystal
oscillator.  The details about how to make this modification are given in the
DBSC text file in TrgHard:[DBSC].  This is for Foreign Scaler #36 which
Norm Amos will use to show the location of the Main Ring Beam with every event
to the nearest 100 nsec tick.  This scaler is called "time_from_micro_blank"
and its hardware reset signal comes via the 9th Lemo from the top on the NIM
to ECL module in M122.  Note that there is a problem about the increment of
this scaler (from its 10 MHz oscillator) vs the Latch-Shift from the Framework
Main Timing MTG.
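
For reference, converting a reading of this scaler to time is one
multiplication, since a 10 MHz clock gives one count per 100 nsec.  The
function name below is made up for this note, and the sketch ignores the
increment vs Latch-Shift problem just mentioned.

```python
CLOCK_HZ = 10_000_000                # on-board crystal oscillator
TICK_NS = 1_000_000_000 // CLOCK_HZ  # 100 nsec per scaler count


def counts_to_ns(count):
    """Time since the hardware reset, in nsec, for a raw scaler count."""
    return count * TICK_NS


if __name__ == "__main__":
    print(TICK_NS)           # 100
    print(counts_to_ns(35))  # 3500
```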

Install ERPB's into rack M109.  The DC is SN #3 and it still needs its DIP
switches set.  The top ERPB is SN# 64, then 63, 62, 61, 59, 58, 57, and the
bottom ERPB is SN# 56.  All the yellow lights went out when I pushed the
LCA load button on the DC.

In rack M106 the -6,14 Trig Tower has munged pins on bits 6 and 7.  See the
log entry from 6-APRIL-1994 for details.  Today I installed a special cable
to try to get the Trig Tower connected up.  Note this is why Trig Towers
eta -5 and -6  Phi 13,14,16 have not been reading out (i.e. all these cables
were left off to give access to -6,14).
..............................................................................

Date:  24-25-MAY-1994  At: Fermi  Topics: Beam run for L15CT 1k events, Ariel
                                  visit, TCC boot, Disconnecting the C40
24-MAY-1994                       Debugger, Term-Sel P2, Special ERPB Cable.
-----------
Ran in beam of luminosity 2.8.  See the TrgMon Dumps in VWork1:
TrgMon_Dump.TXT_L15CT_BEAM_24MAY94.   About 1000 events on disk for this
run (79226).

25-MAY-1994
-----------
Visit from Ariel sales person  Dion Messer.  Need to send her stuff about the
multi JTAG and more written stuff  e.g. IEEE and TI papers.

TCC or network or COOR have some kind of a problem.   COOR says timeout
waiting for Acknowledge at the end of a run (79254).   Now if you do a
$ dirs  D0HTCC::DUA0:[Trigger]   then after 30 seconds it says
 -RMS-F-Net,  Network operation failed at remote node;  DAP code =  01F7 7C54
At the same time as this started 12 L2 nodes crashed and had to be retriggered.
Booting (just a NCP Trigger boot NOT a power cycle boot) fixes the problem.

More tests of disconnecting the debugger from the DSP's.
   Say "Quit" to PDM and all DSP's are still OK
   Say  Shutdown the  IBM/PC  and all DSP's are still OK.
   Disconnected the cable from the PC to the Pod and all DSP's are still OK.
   Reconnected the cable from the PC to the Pod and all DSP's are still OK.

Pulled the Term Select P2 card to verify how it is wired.  I put notes on this
in the P2 note book.  Reinstalled this P2 but left power off to the L15CT VME
crate and to the ERPB-MTG, CRC.   Take PC back to MSU.

Recall that the ERPB servicing eta -5,-6 phi 13,14,16 needs a special cable
made for it to fix the backplane problem.
..............................................................................

Date:  23-MAY-1994     At: MSU    Topics:  Return Hydra-II to Ariel

Return Hydra-II  serial number 7010  (MSU#1) to Ariel for repair of its NMI
to DSP #2 problem.
..............................................................................

Date: 18,19,20-MAY-1994 At: FERMI  Topics: New DSP code 1x2,3x3, Repair a Trig
                                   Tower, Clip pins for more ERPB's, Replace
                                   Hydra-II "A", Test runs and beam run.
19-MAY-1994
----------
Start running L15CT with the code that has the new ISR routines.  So far it
looks OK.

Repair Trig Tower  +9,13  EM.  It was reported at 200% in the pulser run.  It
had a Term-Attn with a cold solder joint.

Look at Trig Tower  +18,4  EM.  It was reported at 50% in a pulser run.  I
checked with the test pulser, looked at its switching noise waveform compared
to its neighbors, checked with Fluke for shorts and opens.  All looks OK.

Worked with Dan Owen clipping pins.  M109 and M110  are now clipped.  Strung
in the ERPB MTG cable for M107:M110.  Strung in the DC to CRC data cables for
M109 and M110.

Pulled out DSP "A".  Pulled Ariel SN 7010  MSU SN#1,  Installed Ariel SN 7047
MSU SN#4.   This fixed the "A2" not responding to interrupt problem.

Install DC cards in M107 and M108.  At power up and DC button push all
ERPB's appear to load OK  i.e. the yellow LED's go out.

Recall that the ERPB at eta -5,-6 phi 13,14,16 has a problem and that I
should have replaced it today while I had the chance.  See  dsfasd entry.

20-MAY-1994
-----------
Ran overnight and stuck after 120k events.

68k_Services was looking at DSP B and came out at  $96030.

B2 Check Reported Comm Ports
   R6=0
   32, 0, 0, 19, 64, 64     <-- values in 4th DMA Cntrl LW for Comm Ports 0:5

A2 Wait for Previous DSP Data             C2 Wait_for_Previous_DSP_Data
   R6=0                                      R6=0
   32, 0, 0, 19, 20, 20                      32, 0, 0, 19, 20, 20

A1 Check_Reported_Comm_Ports              C1 Wait for Sync
   R6 = 0
   0, 20, 0, 20, 20, 20,

Start again and Hang after about 5500 events at 14 Hz.  It had just finished
an "F" transfer.  68k-Services was looking at DSP B.  It came out at $9603A.

A1 Check_Reported_Comm_Ports                    C1 Wait for Sync
   R6 = 0                                          R6=0
   0, 20, 0, 20, 20, 20,                          0, 20, 0, 20, 20, 20

A2 Wait for Previous DSP Data                   C2 Wait_for_Previous_DSP_Data
   R6=0                                            R6=0
   32, 0, 0, 19, 20, 20                            32, 0, 0, 19, 20, 20

A3 Wait for Sync                                C3 Check_Reported_Comm_Ports
   R6=?                                            R6=0
   20, 20, 20, 20, 0, 19                           20, 20, 20, 20, 0, 19

A4 Wait for Sync                                C4 Wait_for_Sync
   R6=0                                            R6=0
   20, 0, 20, 0, 20, 20                            20, 0, 20, 0, 20, 20

                      B1 Check_Reported_Comm_Ports
                         R6=0
                         0, 20, 0, 20, 20, 20

                      B2 Check Reported Comm Ports
                         R6=0
                         32, 0, 0, 19, 64, 64

                      B3 Check Reported Comm Ports
                         R6=0
                         20, 20, 20, 20, 0, 19

                      B4 Check Reported Comm Ports
                         R6=0
                         20, 0, 20, 0, 20, 20

Ran until event 26621 when I was asked to stop it for other tests.

Ran in beam (run 79041 and run 79044) and put 30 events on tape.  L1 was set
for 1 EM Trig Tower over a 2.5 GeV threshold in the eta range -6:+6.  L15CT
was set for 1x2 EM of 5 GeV and ratio to 3x3 Tot Et of 0.8.  Beam was a very
low luminosity 6x1 store.
..............................................................................

Date:  11,12,13-MAY-94 At: Fermi  Topics: More no beam tests of L15CT,
                                  Investigate problem with MFP events,
                                  Books needed at Fermi
11-MAY-1994
----------
Set the L15CT up as last week.  Everything is the same except for the
68k_Services code.   Running 50:50 pass/fail (TAS No driven) and MFP_Ratio of
3.  At low rates (<15 Hz) all appears OK.  At higher rates (28 Hz) things
hang after about 30 seconds of running.  At the point when things hang, it is
an MFP event being processed. The previous event was either a normal event that
passed and was transferred to L2 or a normal event that was rejected.
68k_Services is waiting for GDSP to say that it is at step D3.  GDSP is at
04A6044Dh  Check_Reported_Com_Ports:

When running 50:50 pass/fail (TAS No driven) and MFP_Ratio of 3,
I see the following:
                        Hz L1   FEBz%   DeadBX%
Note: Only one          -----   -----   -------
out of every              5.7    0.1%    0.05%
three events             11.5    0.3%    0.10%
is actually              20.5    0.5%    0.17%
rejected.                28      0.7%    0.24%


When running 50:50 pass/fail (TAS No driven) and MFP_Ratio of very very big,
I see the following:
                        Hz L1   FEBz%   DeadBX%
Note: One               -----   -----   -------
out of every             11.5    0.0%    0.13%
two events               28.6    0.0%    0.33%
is actually              57.3    0.1%    0.66%
rejected.                95.4    0.1%    1.11%
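
A rough way to read these two tables is to turn DeadBX% into an implied dead
time per L1 trigger.  The sketch below assumes DeadBX% is the fraction of
time (beam crossings) that is dead and that it scales as L1 rate times a
fixed dead time per trigger; that interpretation is an assumption of this
note, not something measured separately.

```python
# (L1 rate in Hz, DeadBX in percent) copied from the two tables above.
MFP_RATIO_3 = [(5.7, 0.05), (11.5, 0.10), (20.5, 0.17), (28.0, 0.24)]
MFP_RATIO_BIG = [(11.5, 0.13), (28.6, 0.33), (57.3, 0.66), (95.4, 1.11)]


def dead_usec_per_trigger(rate_hz, dead_percent):
    """Implied dead time per L1 trigger, in microseconds.

    Assumes dead fraction = rate * (dead time per trigger).
    """
    return (dead_percent / 100.0) / rate_hz * 1.0e6


if __name__ == "__main__":
    for label, table in (("MFP_Ratio = 3", MFP_RATIO_3),
                         ("MFP_Ratio very big", MFP_RATIO_BIG)):
        for rate, dead in table:
            usec = dead_usec_per_trigger(rate, dead)
            print(f"{label:20s} {rate:6.1f} Hz  {usec:6.1f} usec")
```

Under that assumption the first table comes out at roughly 83 to 88 usec per
trigger and the second at roughly 113 to 116 usec per trigger, i.e. nearly
constant within each table.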

Books that I can not find at D-Zero Hall and that need to come here: TI C40
book,  Ariel Hydra-II book.

12-MAY-1994
-----------
It appears that when we "hang" in an MFP event it may be DSP C3 that
is causing the problem.  It appears that C3 may not be sending the MFP data to
C2 and thus C2, having received only the Object Lists, "hangs" waiting for
MFP data from C3.  C3 may not be getting told that it is an MFP event or it
may be forgetting this.

To see where things are during a hang do the following:

Examine DSP B2 register R6.  It has the form:               Data from DSP
                                                            -------------
                                                  DSP -->    C2 B3 B1 A2
                                                             -- -- -- --
    Value if all data has been received from this DSP -->    42 42 42 42
    Value if still waiting for all data from this DSP -->    00 00 00 00

Examine DSP C2.  Where is it in its program?  Is it at
Wait_for_Previous_DSP_Data ??   Examine memory location
LG_Xfr_from_Prev_Status_Loc.  This location holds the status of the transfers
to C2 from C1 and C3.
This has the format:                                        Data from DSP
                                                            -------------
                                                  DSP -->          C3 C1
                                                             -- -- -- --
    Value if all data has been received from this DSP -->    00 00 7F 7F
    Value if still waiting for all data from this DSP -->    00 00 00 00
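
The two status words can be decoded mechanically.  The sketch below assumes
the bytes are listed most significant first, as drawn in the two layouts
above; the function and constant names are made up for this note.  The two
sample words are values recorded in the 13-MAY-1994 hang table.

```python
R6_ORDER = ("C2", "B3", "B1", "A2")    # DSP B2 register R6, MSB first
R6_DONE = 0x42                         # byte value once all data received
PREV_ORDER = (None, None, "C3", "C1")  # LG_Xfr_from_Prev_Status_Loc in C2
PREV_DONE = 0x7F                       # byte value once all data received


def split_bytes(word):
    """Split a 32-bit word into four bytes, most significant first."""
    return [(word >> shift) & 0xFF for shift in (24, 16, 8, 0)]


def still_waiting(word, order, done_value):
    """Return the DSPs whose status byte has not reached done_value."""
    return [
        name
        for name, byte in zip(order, split_bytes(word))
        if name is not None and byte != done_value
    ]


if __name__ == "__main__":
    print(still_waiting(0x42420042, R6_ORDER, R6_DONE))      # ['B1']
    print(still_waiting(0x0000007F, PREV_ORDER, PREV_DONE))  # ['C3']
```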

Steve makes a new version of the  L_Scan.A40  routine.  This new version
saves the most recent Wake_Up_Word and the next to most recent WUW in two
memory locations.  It is necessary for us to learn what DSP C3 thought that
it was told to do on the L15CT cycle that "hangs".  Adding this explicit
memory of the WUW is necessary because by the time that we can get a look
at C3, it has already gotten to Step D15, and has erased the other direct
indicators as to whether it received a MFP or a normal WUW.

These two memory locations are:   WUW_N_Minus_1_Loc, and  WUW_N_Minus_2_Loc.

In  C:\C40CODE\WORK\LOCAL   rename the old  L_Scan.A40  to  L_Scan.ALD
Copy the new  L_Scan.A40  from floppy to  C:\C40CODE\WORK\LOCAL

Then assemble and link by:    LOCALASM all   and then   LOCALLNK all.

Reloaded the DSP's and verified that I could see WUW_N_Minus_1_Loc and
WUW_N_Minus_2_Loc in both A3 and C3.  They are currently at  C000 10B6h
and contain all zeros (have not run any cycles yet).

During the next running the idea is to check:  Does the "hang" always involve
DSP C3? and what does DSP C3 think that it was told to do.

To copy the new L_Scan.A40 from the VAX to a PC floppy:  Put the file in
Scratch:[Long.PCCommon] on the online cluster with a filename that fits the PC
8.3 format.  This will appear in drive H: on the PC.  Then just copy from H:
to B: on the PC.

13-MAY-1994
-----------
Look at more "hangs" on MFP events:

"Hang"     Processor         Details
------     ---------    ----------------------

  #1           B2       PC = Check_Reported_Comm_Ports
                        R6 = 42420042   --> DSP B1

               B1       PC = Wait_for_Sync
                        WUW_N_Minus_1_Loc = 110001ff, 000000ff

               C1       PC = Wait_for_Sync
                        WUW_N_Minus_1_Loc = 110001ff, 000000ff

  #2           B2       PC = Check_Reported_Comm_Ports
                        R6 = 42004242   --> DSP B3

               B3       PC = Wait_for_Sync
                        WUW_N_Minus_1_Loc = bb0001ff, aa0000ff
                        Comm 4 \/ 0094008b    002ffc97    00000001    00000000
                        1000e0 /\ 00100082    00000000    c000049c    00000000
                        Comm 5 \/ 0cd4004b    00100091    00000000    00000005
                        1000f0 /\ 002ffc92    00000001    c00004b0    00000000

               B4       PC = Wait_for_Sync
                        WUW_N_Minus_1_Loc = bb0001ff, aa0000ff
                        Comm 1 \/ 0084008b    c00004d7    00000001    00000000
                        1000b0 /\ 00100052    00000000    c000049c    00000000

  #3           B2       PC = Check_Reported_Comm_Ports
                        R6 = 42420042   --> DSP B1
                        Comm 3 \/ 0cd4004b    00100071    00000000    0000010e
                        1000d0 /\ c0000b21    00000001    c0000474    00000000

               B1       PC = Wait_for_Sync
                        WUW_N_Minus_1_Loc = 990001ff, 880000ff
                        Comm 0 \/ 0084008b    002ffcb0    00000001    00000000
                        1000A0 /\ 00100042    00000000    c000049c    00000000

               A1       PC = Wait_for_Sync
                        WUW_N_Minus_1_Loc = 990001ff, 880000ff

  #4           B2       PC = Check_Reported_Comm_Ports
                        R6 = 42424200   --> DSP A2
                        Comm 4 \/ 0cc0004b    00100081    00000000    00000064
                        1000e0 /\ 002ffc01    00000001    c000048A    00000000

               A2       PC = Wait_for_Previous_DSP_Data
                        LG_Xfr_from_Prev_Status_Loc  =  00007f00  --> A1
                        WUW_N_Minus_1_Loc = 000001ff, ff0000ff
                        Comm 3 \/ 0cd4004b    00100071    00000000    0000010e
                        1000d0 /\ c000082a    00000001    c00004b0    00000000

               A1       PC = Wait_for_Sync
                        WUW_N_Minus_1_Loc = 000001ff, ff0000ff
                        Comm 0 \/ 0084008b    002ffc65    00000001    00000000
                        1000A0 /\ 00100042    00000000    c000049c    00000000

  #5           B2       PC = Check_Reported_Comm_Ports
                        R6 = 00424242   --> DSP C2
                        Comm 5 \/ 0cc0004b    00100091    00000000    00000064
                        1000f0 /\ 002ffcb0    00000001    c00004a2    00000000

               C2       PC = Wait_for_Previous_DSP_Data
                        LG_Xfr_from_Prev_Status_Loc  =  0000007f  --> C3
                        WUW_N_Minus_1_Loc = 990001ff, 880001ff --> MFP_Ratio = 1
                        Comm 1 \/ 0080008b    002ffcb0    00000001    00000064
                        1000b0 /\ 00100052    00000000    c00004aa    00000000
                        Comm 0 \/ 0cc4004b    00100041    00000000    0000021c
                        1000a0 /\ c0000c63    00000001    c00004c0    00000000

               C3       PC = Wait_for_Sync
                        LG_Xfr_from_Prev_Status_Loc  =  00007f7f
                        WUW_N_Minus_1_Loc = 990001ff, 880001ff --> MFP_Ratio = 1
                        Comm 4 \/ 0084008b    002ffce2    00000001    00000000
                        1000e0 /\ 00100082    00000000    c000049c    00000000
                        Comm 5 \/ 0cd4004b    00100091    00000000    00000005
                        1000f0 /\ 002ffcdd    00000001    c00004b0    00000000

               C4       PC = Wait_for_Sync
                        WUW_N_Minus_1_Loc = 990001ff, 880001ff --> MFP_Ratio = 1
                        Comm 1 \/ 0084008b    c00004d7    00000001    00000000
                        1000b0 /\ 00100052    00000000    c000049c    00000000


Steve makes a new version of  L_Scan.A40  and of  L_ISR.A40.   The new
L_Scan has a new memory location called  This_Event_Type_Loc  and it
explicitly checks the new valid WUW Flags Byte against "0" and "1".  The new
L_ISR has routines that save the Processor Status and then restore it when
returning.

Get these new files from the VAX to the floppy using the PC and PathWorks.

In  C:\C40CODE\WORK\LOCAL   rename the old  L_Scan.A40  to  L_Scan.ALE
                            There are now both an  L_Scan.ALD  and  L_Scan.ALE
Copy the new  L_Scan.A40  from floppy to  C:\C40CODE\WORK\LOCAL

In  C:\C40CODE\WORK\LOCAL   rename the old  L_ISR.A40  to  L_ISR.ALD
Copy the new  L_ISR.A40  from floppy to  C:\C40CODE\WORK\LOCAL

Then assemble and link by:    LOCALASM all   and then   LOCALLNK all.
..............................................................................

Date: 4,5,6 May 1994   At: Fermi  Topics:  Installed two more DC cards and CRC
                                  to Hydra II cables, Installed L15CT Term
                                  Answer P2 to M103 L15 Framework Answer Done
                                  cables, Work with L15CT doing all pass, all
                                  reject, and full dance, write a tape with a
                                  couple dozen events from L15CT, Checked what
                                  PAL's were installed in M103 L15 FW, pickup
                                  a 2nd Vertical Interconnect master and slave
4 May 1994                        from Greg Cisco.
----------
Installed DC's in Racks M105 and M106.  We notice that the YELLOW LED
on the ERPBs in M106 turns on when power is first applied to the DC, but
then turns off when the "configure LCA" pushbutton on the DC is pressed.
The yellow LEDs in M103, M104, and M105 only briefly flash off when the
"configure LCA" pushbutton is pressed.  Also, the LCAs in M106 are NOT
configured until the pushbutton is pressed.  We are not certain what
happens in the other racks.  We put connectors on the DC->CRC cables for
these DC's, and also made a short extender cable for the MTG->CRC cable
for these DC's.  The old cable did not have a connector for a terminator.

We installed CRC->DSP cables for these next 2 Racks.  We now have 4 of the
5 channels on the first CRC occupied.  Cable routing space is getting very
tight, we are going to have to do something about it soon.

We also installed the Done/Answer cables to the L1.5 FW in M103, but did
not plug them in.  We don't have the Term Answer card finished yet (but
it's close).

The Answer Done cable from M124 L15CT to M103 FW is made in the following
way:
                                                             Upper L15CT
  M103 L15                                                      Crate
  Framework                                         Pin #1 end  +-----+
  +-----+ Pin 1 end                   --------------------------|Trm16|
  |Trm16|----------------------------'                          |ANSWs|
  |     |        34 conductor            .----------------------|Trm23|
  |ANSWs|        twist-flat      ------ <      .----------------|Trm16|
  |     |                                 \   /                 |DONEs|
  |Trm31|---------.--------------------.   \ /   .--------------|Trm23|
  | N.C.|_______.  \                    \   X   /               | N.C.|
  +-----+        \  \__                  \ / \ /     Pins 33,34 +-----+
                  \____|  Pins 33,34      X   X     no connection
  M103 L15             no connection     / \ / \
  Framework                             /   X   \              Lower Crate
  +-----+ Pin 1 end                    /   / \   \   Pin #1 end +-----+
  |Trm16|-----------------------------'   /   \   '-------------|Trm24|
  |     |        34 conductor            /     \                |ANSWs|
  |DONEs|        twist-flat      -------<       ----------------|Trm31|
  |     |                                '----------------------|Trm24|
  |Trm31|---------.-------------------.                         |DONEs|
  | N.C.|_______.  \                   \------------------------|Trm31|
  +-----+        \  \__                                         | N.C.|
                  \____|  Pins 33,34                Pins 33,34  +-----+
                         no connection             no connection


Verified what PAL's are now in the L15 Framework Term Receiver MTG.  PAL's
are installed in the Receiver MTG for L15 Terms 0:18.  All Veto Confirm
PAL's are installed for Spec Trig's 0:15.

Recall that we need more BIT2PAL's for the ERPB MTG.

We verified that Racks M105 and M106 were sending data to the DSPs.  We
did not do any super-serious data transfer checks like with M103 and M104.
We will need to do these checks before the L1.5 Cal Trig is a usable product.
We still haven't fixed the munged backplane problem with the top backplane
in Rack M106.  We just read "0" for EM and Total Et from eta = -5, -6 at
phi = 13, 14, and 16.  Phi 15 reads correctly.  Steve made Pallet files for
M105 and M106 but they haven't been tested.

5 May 1994
----------
We tried (but did not succeed) to do the full "dance" test during a beam
study period.  We had some problems with both DSP software (all related to
Steve's addition of the Type 4 Entry in the DeBug section) and the 68K
Service software (related to moving the "synch state" from D12 to D15).
These problems took quite some time to find and fix.

We then discovered that we had the Specific Trigger Fired and its Strobe
wired up to our patch panel upside-down.  This caused ERPB data to roll into
the DSP Comm Ports when it was not expected.  It appears that this can
confuse the "Sanity and Configuration Checker" which runs when the DSPs
are reset.

After fixing that, we verified that the DSP's and the 68K Services could
handshake correctly (i.e. the same test we did last week).  After proving
that functionality, we attached the Term Answer paddleboard and hooked
up the cables to the L1.5 Framework.  We tried to accept every event
(by forcing a large EM Et in a Trigger Tower outside of the current
eta coverage).  We discovered that we were exiting every L1.5 Decision
Cycle via Timeout.  This was because we were using the Specific Trigger
Fired Strobe to initiate L1.5 Cal Trig processing.  Of course this Strobe
doesn't fire until AFTER the end of the L1.5 Decision Cycle, so the L1.5
Cal Trig didn't start running until the L1.5 FW timed out.  To fix this
problem we decided to change the ERPB MTG setup somewhat.  We now use
the AND of Specific Trigger x (x = 0:15) and Start Digitize to feed MTG
Channel 6.  MTG Channel 6 feeds MTG channel 8, which outputs a pulse
one Bx long, starting at 82 of the Bx which made the Specific Trigger fire.
MTG Channel 8 feeds the /STORE and /LATCH "master" MTG Channels, and also
feeds Channel #7.  Channel #7 makes another 1 Bx pulse, starting at 75 of
the Bx following the Bx with the positive L1 decision.  Channel #7 feeds
the Transmit_Trigger "master" MTG Channel.  This scheme will work (with
replacing the Specific Trigger input by the >=1 Term Required signal from
the TSP2 Paddleboard, which will require actually programming the Term
Select paddleboard) until double buffering is required.

We needed to make a new ERPB MTG PROM, ERPBTG1B.DAT.  To get this file on
a floppy disk, we used the FTP program on the Macintosh (in the Telnet folder,
which is in the Communications folder).  We then used Apple File Exchange on
this Mac to write an MS-DOS compatible disk to take to the Data-I/O Unisite
Model 48.  Theoretically we could have used the PC (Dan claims to have
done this) but the PC seemed to be munged.

When we pulled out the ERPB MTG we noticed that it has no PROM or PALs for
Timing Signals 25:32 (i.e. high eta).  Before we are able to use |eta|
17:20 we will need to install a PROM and appropriate PALs in this MTG.

The beam returned while we were still getting ready for another attempt at
the "always pass" test.  We left the power turned off in the CRC/MTG crate
and in the L1.5 Cal Trig VME Backplane.

6-May-1994
----------
We powered up the L1.5 Cal Trig VME Crate again.  Powering up this
crate, and getting code correctly loaded into all 12 DSPs, still appears
to be a delicate operation with some not-understood complications.  Some
problems that we see are:

    Not all 12 DSPs successfully complete their "Sanity and Configuration
    Checker".  This is apparent either by watching the Shared Dual Port
    Access LEDs (which you can see by reflection) and noticing that the
    correct "pattern" does not appear, or looking at the program counter
    of each DSP.  The end of the Sanity and Configuration checker should
    be
        address     BU  address     where address is 2ff931 or thereabouts.
    When this happens, the associated C40s do not always operate correctly
    (for example, they may have "junk" in some Comm Port Input or Output
    FIFOs), or may not correctly respond to interrupts.

    Does the VME RESET signal act differently from the RESET button on
    the DSPs, and do both act differently from resetting the DSPs via
    Boot Control Registers (which TCC must do)?  The "software" reset
    available via the debugger is absolutely NOT equivalent to a hardware
    reset (for example it does not provoke the Sanity and Configuration
    Checker).

What can we do to improve the robustness of the start-up sequence?  Also,
what about robustness during "normal cycling"?  In all of our tests, once
we have gotten over the "hump" (which may take a few cycles of "normal
cycling" followed by resets or new code loads) we then appear to be moving
smoothly.  Are we just on the ragged edge of something?

We did the "always pass" with no difficulty.  The "always fail" test worked
after we removed a line which had been added to the 68K Service code for
testing.  We appeared to exit a small fraction of cycles via Time Out
(more on this in 2 paragraphs).

We then tried the "dance" test, but letting the 68K generate Term Answers
based on the TAS# rather than the DSPs' determination of Term Answers.
We had a design problem in 68K Services which did not correctly handle
the "fail" case (it set the "DSP Data Available" flag when it should not
have done so).  This design problem was fixed and the "dance" test then
succeeded (again with a small fraction of "exit via Time Out").

We then tried to Mark and Force Pass one out of every 3 events, as well
as "dancing."  Under these conditions we saw 33.33% "Exit L1.5 by Time Out".
The problem was (is) that 68K Services always waits for GDSP to arrive
at D3 (and provide Term Answers) before sending the Term Answers to the
L1.5 Framework.  This has 2 problems:  (a) L1.5 Framework times out
at 250 usec, while the GDSP takes more than 350 usec to arrive at state
D3 during MFP Events, and (b) the 68K does not actually try to "force
pass" the event, but rather just sends the GDSP's Term Answers to the
L1.5 Framework.  This may be the same problem that caused us to have a
small fraction of "exit via Time Out" when we tried to Mark and Force
Pass a small fraction of the events.  We DID NOT fix this problem in the
68K Services code.  Note that a Time Out forces the event to be accepted,
so we do actually pass all of the Mark and Force Pass events.  We may try
to return Term Answers and Dones to the L1.5 Framework at random times,
but under the special running conditions we used, the "reset DONEs" logic
in the Term Answer card masked the DONEs.
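
A rough sketch of the arithmetic behind the 33.33% figure.  It assumes (as
described above) that every Mark and Force Pass event keeps the GDSP busy
longer than the 250 usec L1.5 Framework timeout and that no normal event
does.  The function name and the 100 usec figure for normal events are made
up for illustration; they are not taken from the 68K Services code.

```python
# Sketch only: models the "1 in 3 events is Mark and Force Pass" test.
L15_FW_TIMEOUT_USEC = 250   # L1.5 Framework timeout (from the text)
GDSP_MFP_USEC = 350         # GDSP takes more than this on MFP events (from the text)
GDSP_NORMAL_USEC = 100      # ASSUMED: illustrative value below the timeout


def timeout_fraction(n_events, mfp_every):
    """Fraction of events that exit the L1.5 cycle via Time Out."""
    timeouts = 0
    for i in range(n_events):
        is_mfp = (i % mfp_every == 0)
        gdsp_time = GDSP_MFP_USEC if is_mfp else GDSP_NORMAL_USEC
        # 68K Services waits for the GDSP, so the Framework times out first
        if gdsp_time > L15_FW_TIMEOUT_USEC:
            timeouts += 1
    return timeouts / n_events


print(f"{100 * timeout_fraction(3000, 3):.2f}%")
```

Under these assumptions the timeout fraction is just the MFP fraction, which
matches what we saw.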

Running under the above conditions, we put 59 events on disk (with help
from the Guidas) in the file:

    DATA3:[CAL]CALOR$1_078267_01.X_ZRD01

(available only at Fermi, NOT at MSU).

We could probably put events on disk by ourselves under TAKER, but they
need to land in a directory which is accessible both to COOR and to whatever
account is used to run TAKER.  Use "Turn Recording On" and "Data Disposition"
to shoot data at a disk.

Steve looked at these events a little bit under FZBASCII (do SETUP UTILS,
choose "7" [SETUP_UTIL.COM], and then type FZBASCII).  Some hints on using
FZBASCII:

    FZopen  gets a file (for example the above file)
    NExt    skips to the next record in the file (beginning of data
            counts as a record)
    FInd    Finds records which meet certain criteria
    DBank   is the moral equivalent of ZBD, but has an even less-friendly
            user interface (!)

        Once in DBank, there is no "help" facility.  Some hints for DBank:

            <down>,<up>     scroll down or up in the data block
            <number><CR>    go to longword <number> (in decimal)
            X<CR>           set output format to hex
            <CR>            exit DBank, return to main menu

        Note that any commands you type DO NOT appear on the DBank screen.

These events looked completely correct.  There is a mix of MFP and normal
events.  The first 4 events are:

    Event #         Comment
    -------         -------
    1863            Mark and Force Pass
    1908            Mark and Force Pass
    1924            "normal"
    1940            "normal"


Steve sent mail to Dan Owen, Djoko, and Greg Snow telling them to look
at this data and carefully examine it.

I picked up from Greg Cisco a 2nd master VI and a second slave VI.  Both
of these came back to MSU for testing and use in the TCC to L15CT crate link.
The 1st VI master and slave that I picked up from Greg a couple of weeks ago
remain in use in the L15CT crate bus to bus link.
..............................................................................

Date: 27,28,29-APR-1994   At: Fermi   Topics: Test L1.5 Cal Trig with
                                              real 68K Services code,
                                              sending data to L2 via VBD.
27-APR
------
We brought the PC back to FNAL to continue testing of the L1.5 Cal Trig.
We were able to debug all 12 DSPs in the JTAG scan path (which we had
also previously been able to do at MSU).  We loaded the correct code
into all 11 Local DSPs and the Global DSP, and also made the correct
A-to-B and C-to-B Local-to-Global connections.  This had never been done
at MSU.  Using Steve's "baby" DSP control program we were able to control
all 12 DSPs and move data into a MVME214 module (without reading ERPB
data).

We ran all 12 DSPs under the "baby" control program, running as fast as
we could (not writing data to 214's, and also without reading ERPB data).
At top speed, each complete cycle took about 117 microseconds.

We have a problem with DSP A2.  It does not respond to its NMI (which is
used by TCC to tell the DSP to load its Parameters).  It does not respond
to the push-button or to the "VME" NMI (recall that the NMI source is
selected by the Hydra Interrupt Control Register).  All other DSPs on
Hydra-A respond to this interrupt correctly.  Hydra-B and Hydra-C also
have no problems with NMI processing.  Either this DSP is broken (note that
this Hydra is MSU S/N-1 which at one time correctly responded to NMI), or
we are doing something wrong with this Hydra card.  Note that Hydra-A is
the Hydra which is NOT on VSB, and DSP #2 is the DSP with VSB access.

We then "meshed in" the real 68K Services program (written by Dan).  After
fixing a few communication mismatch problems we were able to cycle the
DSPs under control of the real 68K Services program.  We were able to
pass events up to Level 2 (not during global running!) but did not look
at the contents of any events.  We also noticed that both 214's were
being accessed in a single event (i.e. both 214 LEDs lit up more or
less at once).  This indicates some problem.

Installed the Data Cable to connect the L15CT crate.  The path is now the
following:  from the Sequences to a Repeater in M114, from this repeater's
output to the L15CT crate in M124,  from L15CT crate to another Repeater in
M114,  from the output of this 2nd Repeater to the 3 VBD's in M114 and M115,
then up to the third floor.  Note that all three cables (64 TF, 26 TF and
RG58U) in the Data Cable follow this path.  No new Data Cable problems
appear to have been started by this 60 foot addition to the Trgr Data Cable.

Note about the VBD in ByPass mode:  Even in ByPass mode the VBD is not
completely safe.  If the VBD is in ByPass mode and you tell it to do random
things it CAN put data onto the Data Cable.  Only switch to ByPass mode (or
back from ByPass mode) when data is NOT flowing.  When in ByPass mode and you
want to work in the VME crate (in a way that might talk to the VBD), UNPLUG
the 64 TF and the 26 TF from the VBD.  Do NOT unplug the RG58U token cable.

28-APR
------
Since we don't have a TCC hooked up to VME, we made a better version of
the "generate NMIs to all 12 DSPs" program.  This one tries to verify that
all DSPs have correctly loaded their parameters.  Not having the TCC around
has the advantage of removing a complex element from the system.  The PC
can load the code and the 68K can provide the interrupts; is there anything
else we can do to make TCC-less life easy?  There is a file in the
[D0_Text.Level_15.CalTrig.Hardware_Software_Text] directory about how to
start up the DSPs without the TCC.

We also added some switches that let us either let the 68K look at Path
Select P2PB to choose "Innocent By Stander" vs. "That's Me" processing, or
force one or the other.

We ran under the same conditions as yesterday (real 68K Services, no ERPB
data, not Global running so we had our own Specific Trigger) to work on
the "both 214's hit on one event" problem.  We were able to get a ZBDUMP
of some events.  They looked about right (with a few small problems), but
we realized that we have no way to demonstrate that the Data Block is
all from the same event!  We revived the old "pass the TAS Number from
68K to DSPs in the Wakeup Word" idea (it actually never left the DSPs...)
and for now just stuck the Wakeup Word into the Global DSP Object List,
which is near the end of a "normal" Data Block (but not near the end of
a Mark and Force Pass Data Block).  We should define a new entry type
for the DeBug section which contains some synch information.  This entry
type should always appear in the DeBug section.  It should appear at
the end of the block.  What else should we stick into the Data Block?
L1 has lots of Data Block consistency checks, why didn't we build any
into L1.5 from the start?

After sticking the TAS Number into the Data Block it was clear that the
Load Buffer was being changed while the Global DSP was still loading it.
We found the problem in 68K services and made a temporary fix.  That
solved the "both 214's hit" problem.

We played with the mix of Mark and Force Pass vs. "normal" events.
We have some evidence that the VBD reads out the "full" DeBug section
even on events which didn't receive Mark and Force Pass processing.  We
need to look more carefully at this.

We also see a VBD problem.  Whenever we change 68K Services code, we
pause the run, abort the 68K, reload the 68K, restart the Services code,
and resume the run.  After resuming the run, we see lots of crate token
errors on our data cable (which would seem to indicate that the VBD was
not being read out), but we see events flowing through the VBD.  Hitting
"reset" on the VBD clears the problem.  We should be very careful about
playing with the L1.5 Cal Trig Service 68K when events are flowing through
Level 1 (which is on the same data cable).

We then let the DSPs receive data from a subset of the ERPBs (|eta| 1..4).
We set up the MTG to control the ERPBs exactly the same as the first
Data Transfer test done a month ago.  We were able to see the calorimeter
noise in the Mark and Force Pass data for the DSPs which were hooked up
to L1.  We also proved that the DSPs needed to see ERPB data by shutting
off ERPB transfer and watching the system hang.

Finally, we did some speed tests of the whole system.  Here are the
results:

(1) Readout Required, That's Me processing, no ERPB data

Trigger     Geo Sect    L2
Rate (Hz)   5 FE Bz     Disbl   Prescale    Comment
---------   --------    -----   --------    -------
  5.7       0.1%                50000
 11.5       0.1%                25000
 22.9       0.3%                12500
 38.2       0.5%                 7500
 72.0       0.9%                 4000
 94.0       1.2%        ~40%     2000       rate limited by L2, but
 93.0       1.2%        ~60%     1500       rate vs. FE Busy is realistic
 94.0       1.2%        ~75%      900

(2) Readout Required, Innocent By Stander, no ERPB data

Trigger     Geo Sect    L2
Rate (Hz)   5 FE Bz     Disbl   Prescale    Comment
---------   --------    -----   --------    -------
 85         0.1%                 2000

(3) Readout Required, That's Me, including ERPB Data

Trigger     Geo Sect    L2
Rate (Hz)   5 FE Bz     Disbl   Prescale    Comment
---------   --------    -----   --------    -------
  5.7       0.1%                50000
 57.0       0.7%                 5000
 94         1.1%                 2500
 94         1.1%        ~45%     1750       rate L2 limited again, but
 89         1.1%        ~100%     900       rate vs. FEBz realistic

What we have not done is to look at any Readout Not Required rates.  This
will happen a lot in the real system so we should look at these rates also.
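
A quick cross-check on table (1), sketched in Python.  It ASSUMES that a
prescale of N means 1 of every N input triggers is passed, so that
rate x prescale gives back the rate feeding the prescaler.  Only the rows
with no L2 Disable are used; the helper name is made up.

```python
# (trigger rate in Hz, prescale) for the rows of table (1) with no L2 Disable
rows = [(5.7, 50000), (11.5, 25000), (22.9, 12500), (38.2, 7500), (72.0, 4000)]


def implied_input_rate_hz(rate_hz, prescale):
    # ASSUMPTION: prescale N passes 1 of every N input triggers
    return rate_hz * prescale


implied = [implied_input_rate_hz(r, p) for r, p in rows]
for (r, p), x in zip(rows, implied):
    print(f"{r:6.1f} Hz x {p:6d} = {x / 1000:6.1f} kHz")
```

For these rows the product stays between roughly 285 and 288 kHz, which is
what one would expect if the test trigger were prescaled down from a fixed
input rate.  The rows with a nonzero L2 Disable fall below this because the
rate is limited by L2.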

29-APR
------

We swapped in the new CRC card.  When Steve turned the L1.5 Cal Trig Power
Pan on with the CRC plugged in and fully hooked up to the DC's and the
Hydra's, the AC fuses for both the -5.2V and -4.5V supplies blew.  Steve
replaced the fuses (the dead fuses were 1.5A, but all I could find for
replacements were 2.25A).  After fuse replacement things looked fine.  The
new CRC appears to work.

Dan made some "bug fix" changes in 68K Services code.  The current version
has no known problems.

We captured some good ZBDumps of the TRGR bank (including both L1 and L1.5
Cal Trig).  They are stored in the VWORK1 directory with names containing
the string "29_APR".  We got "Innocent By Stander," "That's Me Normal," and
"That's Me Mark and Force Pass" events.  We could compare the L1 data to
the L1.5 data in the Mark and Force Pass events.

We still have not tested any "Dump Event" processing.

Here are some reminders about running TAKER and ZBD:

To set up taker:  D0SETUP   TAKER

To run taker:     TAKER/FU

L1.5 Cal Trig test trigger definition is under:  CAL  --->  CAL_TRIG_L15

To setup ZBD:     D0SETUP   ZBDUMP     (not ZBD!)

To run ZBD:       ZBD

The lengths of the crates in the TRGR bank are:

Level 1:               2847 longwords

Level 1.5 Cal Trig:     426 longwords (without Mark Force Pass data)
                       3396 longwords (with Mark Force Pass data)

The maximum length of the TRGR bank is less than 6300 longwords.  If you
select an ending address greater than the TRGR bank length, ZBD collects
data to the end of the bank.
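
A small sketch to check the lengths above against the 6300 longword figure.
It ASSUMES the TRGR bank is just the Level 1 crate plus the L1.5 Cal Trig
crate with nothing else added (no extra headers); the function name is made
up.

```python
L1_LONGWORDS = 2847            # from the text
L15CT_NORMAL_LONGWORDS = 426   # without Mark Force Pass data
L15CT_MFP_LONGWORDS = 3396     # with Mark Force Pass data
BANK_LIMIT = 6300              # "less than 6300 longwords"


def trgr_bank_length(mark_force_pass):
    # ASSUMPTION: bank length is the plain sum of the two crate lengths
    l15 = L15CT_MFP_LONGWORDS if mark_force_pass else L15CT_NORMAL_LONGWORDS
    return L1_LONGWORDS + l15


print(trgr_bank_length(False), trgr_bank_length(True))
```

This gives 3273 longwords for a normal event and 6243 for a Mark Force Pass
event, so an ending address of 6300 covers either case.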

When L1.5 Cal Trig is not doing Mark Force Pass processing, it is first in
the TRGR bank.  Usually, when L1.5 Cal Trig does Mark Force Pass processing,
the Level 1 data is first in the TRGR bank.

We have still not solved the "VBD reset required" problem.  It never happens
if we just pause our run, but frequently happens if we pause the run and
abort the 68K (for example to load a new version of 68K services) and then
start execution of 68K services at the normal $95000 entry point.

We also still have not solved the "A2 ignores NMI" problem.
..............................................................................

Date: 21,22,23-APR-1994   At: Fermi   Topics: Install JTAG, and more L15CT
                                             modules and jumpers, first tests
                                             of 68k_Services and known problems
                                             known Vertical Interconnect
                                             problems.

Installed the JTAG pod and the JTAG wire wrap board in the upper L15CT
VME crate.  These are installed on the piece of metal that covers the P3
backplane location.

Installed the "A" and "C" Hydra-II cards.  Now the Hydra-II's installed
at Fermi are the following:     "A"  is Ariel SN# 7010  MSU SN# 1
                                "B"  is Ariel SN# 7052  MSU SN# 3
                                "C"  is Ariel SN# 7044  MSU SN# 2

Now all cards are installed except for the slave Vertical Interconnect
from the TCC.

Installed the proper (I hope) backplane "grant" jumpers to cover locations
where we do not have cards installed.

         VME  BG3 jumpers cover slots:  8, 18, 19
         VSB  BG  jumper covers slot:  11

Worked on typing in the 68k_Services source program and making initial
tests of it.  So far there are three known  problems-features:

1. Either this VBD is different from the L1 VBD or else it appears different
   to the 68k because of the VI between them.  Whatever the cause, it ends
   up that the Base Addresses do not load into the VBD correctly at $B800 if
   they are loaded as longwords.  Note that L1 VTC code does load them as
   longwords!!

2. The VBD (or at least this VBD) does not appear to go ahead and dump data
   on the floor (i.e. onto a non-existent data cable) if it is controlled
   via its hardware SRDY and DONE lines.  The L1 VBD does dump data on
   the floor but it is controlled via VBD registers.

3. Dean knows of two problems with VI's when they are the Slot 1 Crate
   Controller AND they are passing VME bus mastership back and forth
   with a different module.
   1) with a 68k cpu module things can hang. This was noticed in the HV
      racks.  The cause is a feature that was built into the VI for Goodwin
      front end software compatibility.  There are PAL's available to remove
      this feature.
   2) with VBD modules things can hang.  This was noticed in CD (or else
      muon crates).  It is not understood but is thought only to happen
      when the bus arbitration overlaps and VBD is master data transfer.
..............................................................................

Date: 20-APR-1994      At: MSU,   Topics:   M111 upper Tier1 power pan breaks.
                           Fermi            Fire Tech people doing something.

Power Pan Replacement at Fermi
------------------------------
Replaced the Power Pan in upper M111 (i.e. eta +17:+20 phi 1:16)
Removed Power Pan sn#  PDM-14 and replaced it with PDM-22.  It required right
about 1 hour to replace the Power Pan.  I found a good strong proper height
cart to use to hold the Power Pan as it was being installed.   I removed the
test data generator for ERPB Distributor Cap card at the same time.

On TCC's disk I renamed the special Trics_Init_Auxi.dat   to
Trics_Init_Auxi.20APR94_Only_dat.   I understand that this special file should
be deleted but I wanted to check on this before deleting it.

Returned the 68k VTC to running its normal program but did not delete the
special eta 16 program.

When the system was powered back up I had the normal ZRL induced TCC problem.
TCC was awake and taking commands but I do not think that it was really doing
very much with them.  It would take about twice as long as normal to Initialize
and then say BAD FAILURE.  I NCP Trigger Booted it.  This did not help.  I then
power cycle booted it (power cycle both TCC and its BA23 box).  After the
power cycle all was OK.

The alarm message says that the Power Pan -4.5 brick had failed at 11:12 AM
this morning.  They had run with beam from about 00:30 until 5:45 this
morning when a D0:: online disk locked up.  They fixed that by 11:15 AM
i.e. 3 minutes after the L1 Power Pan failure.  It took on the order of one
hour to get the eta |1:16| work around going.  They ran that way  until
about 21:20 this evening.  At that time I replaced the Power Pan while muon
people did some work and then things started back up for the tail end of the
store.

Note we were tripped off yesterday (morning at about 11 AM I think) by the
fire tech people doing something.  Today when I arrived the first thing that
I saw as I came in the front door was that our VESDA "thermometer" was up
about 3 or 4 notches.  The fire tech people were still around today.  There
were two of them in the 1st floor MCH when I arrived, two by the front door,
and two in the kitchen.  They say that right now the Trigger VESDA is jumpered
out.  As far as I and Maris know, no MSU people were ever contacted before
this work started nor were we ever told that the Trigger VESDA protection
had been jumpered out.

The get it running work around from MSU
---------------------------------------
The upper power pan in Rack M111 broke sometime between 5AM and noon
today.  At about 12:15 Jan called from the control room to report the
following:
            M111 upper pan (pan #2) -4.5V supply was making -3V
            according to the low voltage monitoring system.  Confirmed
            by Dean S measuring -3V at test points on pan.  He measured
            approximately 0V AC.  All 4 LEDs were still on.

            The triggering symptom they saw was "infinite Missing Pt"
            (I assume Dean meant that the Missing Pt triggers fired a
            lot).

Steve and Philippe had Dean turn off the 4 Tier 1 pans in Racks M111 and
M112, but leave the Tier 2 pan in M112 powered up.

We at first used TRICS to tell TCC that the Cal Trig eta coverage was
limited to eta +/- 1..16 thinking that this was the fastest way to get
back on the air.  We reminded them that they would have to modify all 16
reference set files, but they did not. No reference set download worked
because TCC rejects any message with trigger tower boundaries outside the
Cal Trig eta coverage  defined in TCC.  TCC does NOT try to program
the parts of the Reference Set which are in the defined coverage--instead
it returns a bad parameter acknowledgement message to COOR.
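
The all-or-nothing behavior described above can be sketched as follows.
This is NOT the TCC code; the function name, arguments, and return strings
are hypothetical and only illustrate why a Reference Set defined over
eta 1..20 fails completely when the coverage is set to 1..16.

```python
def check_ref_set_message(eta_lo, eta_hi, coverage_max_eta):
    """Return 'OK' or 'BAD_PARAM' for a Ref Set covering |eta| eta_lo..eta_hi.

    Hypothetical model: any boundary outside the defined coverage rejects
    the whole message; the in-coverage part is NOT programmed.
    """
    if eta_lo < 1 or eta_hi > coverage_max_eta or eta_lo > eta_hi:
        return "BAD_PARAM"
    return "OK"


# Unmodified reference set files (eta 1..20) against the two coverages
print(check_ref_set_message(1, 20, coverage_max_eta=16))
print(check_ref_set_message(1, 20, coverage_max_eta=20))
```

With the coverage at 16 the unmodified files are rejected; with the coverage
put back to 20 the same files are accepted.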

To save the Control Room people from having to edit all of their Ref Set
configuration files, we re-defined the Cal Trig eta coverage to +/- 1..20.
We overwrote the Tree Offsets to be correct for eta range +/- 1..16
(including changing the Px/Py offset from 3 to 2).

There is a new file in D0HTCC::[TRIGGER]TREE_OFFSET_ETA_16.DAT
that overwrites the EM, HD and TOT 1st and 2nd Lookups tree corrections with
the values for a coverage of eta = 1..16.
The corrections for Px and Py are also matched to having only two tier #2,
that is 2 counts.
However the HD 2nd lookup correction value is pre-L15CT HD PROM change and is
incorrect.

There is a temporary TRICS_INIT_AUXI file >>on TCC only<< (not backed up)
that starts with
                ***** Disclaimer *****
        This is a temporary version overwriting the tree offsets to eta =16
          to remedy to the loss of a power pan in M111
        This version only lived on TCC on 20-APR-1994
        This version has not and doesn't need to be archived.
This file needs to be deleted once the eta = 17..20 eta coverage is back online.

We re-initialized and the Control Room then re-downloaded Level 1.  At that
point things looked about right.  There were 3200 errors when initializing
the Cal Trig, but all seemed to come from the caltrig at eta 17..20.

We then made a temporary version of the VTC program.  This temporary
version forces the correct Zero Response (8) in the ADC counts for
eta +/- 17..20 in the Data Block.  The temporary version is called
RUNME68020_ETA_16.ABS and is in the VWORK2: directory as well as in
TRGCUR.  We had someone from the Control Room load this temporary VTC
program into the VTC.

All of this work took about 1 hour, during most of which time Level 1
was not able to take data (although the rest of the experiment was).

Between 5AM and noon, there were cluster problems.  The broken power
supply was not discovered by the Control Room until after the cluster
problems were fixed.
..............................................................................

Date: 6-APR-1994       At: Fermi  Topics:      L15CT Installation Work
       through
      8-APR-1994

6-APR-1994
----------
Worked on L15CT installation.  Cut pins and shrouded all backplanes in
4 racks (M105:M108).  This required from 22:09 to 9:07,  i.e. it still
takes about 3.7 hours to cut pins and shroud a rack.

One significant problem was discovered in rack M106 top Tier 1 backplane.
Trigger Tower -6,14  Tier 1 Backplane SN# 7   TotEt #1 is OK   first 5 bits
of TotEt #2 are OK, bit 6 has a pin bent into bit 7, bits 8 and 9 are OK,
TotEt #3 and #4 are OK.  We should check the inventory book about this
backplane.  The bent over pins have teflon tubing over them.

7-APR-1994
----------
Ran the cable from the ERPB MTG to the ERPB's in M107:M110.  This cable
goes first to M110, then to M109, then to M108, and finally to M107.  There
are three sections of cable between each rack.  There are 31 sections
between the M124 ERPB
MTG and the connector in M107.  Note this is 4 sections shorter than the
run to M103 (i.e. 10 nsec shorter to match the Cal Trig Timing cables as
it should).
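
The section counts and the 10 nsec figure imply a delay per cable section.
A short sketch, ASSUMING every section has the same length and delay (the
names are made up):

```python
SECTIONS_TO_M107 = 31            # from the text
SECTIONS_SHORTER_THAN_M103 = 4   # from the text
DELAY_DIFFERENCE_NS = 10.0       # from the text

# ASSUMPTION: every cable section has the same delay
NS_PER_SECTION = DELAY_DIFFERENCE_NS / SECTIONS_SHORTER_THAN_M103


def run_delay_ns(n_sections):
    """Total delay of a run made of n_sections identical sections."""
    return n_sections * NS_PER_SECTION


sections_to_m103 = SECTIONS_TO_M107 + SECTIONS_SHORTER_THAN_M103
print(NS_PER_SECTION, run_delay_ns(SECTIONS_TO_M107),
      run_delay_ns(sections_to_m103))
```

That is 2.5 nsec per section, about 77.5 nsec for the M107 run and 87.5 nsec
for the 35 section run to M103, under the equal-section assumption.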

Install ERPBs, CTFE-to-ERPB cables, daisy-chain data cables, and parallel
timing cables in Racks M107:M110.  This work required from 21:56 to 4:55
with 2 people working.  So it takes about 3.5 man-hours to stack ERPBs and
cable a single rack.  This is on top of the 3.7 man-hours required to cut
pins.  This is about as fast as it will ever go, I think.
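
The 3.5 man-hours figure can be reproduced from the clock times above
(21:56 to 4:55, 2 people, 4 racks).  A sketch, with a made-up helper name:

```python
from datetime import datetime, timedelta


def man_hours_per_rack(start, end, people, racks):
    """Man-hours per rack for a shift given as HH:MM start and end times."""
    fmt = "%H:%M"
    t0 = datetime.strptime(start, fmt)
    t1 = datetime.strptime(end, fmt)
    if t1 <= t0:
        t1 += timedelta(days=1)   # the shift ran past midnight
    hours = (t1 - t0).total_seconds() / 3600
    return hours * people / racks


print(round(man_hours_per_rack("21:56", "4:55", people=2, racks=4), 1))
```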

DCs are not yet installed in Racks M107:M110.

We turned the L1 and L1.5 FW (and M114) back on but left the L1 Cal Trig
turned off when we left (including T3).

8-APR-1994
----------

"Ohms-check" all power pans in Racks M103:M107 and also the T3 power pan in
Rack M108--everybody was fine.  We turned on the Cal Trig and there were no
power or smoke problems with either the old stuff or the ERPBs we installed.

We performed an INITIALIZE/RESTORE and checked TRGMON afterwards.  TRGMON
was geeked up: it showed big (but mostly or completely unchanging) global
energy and momentum sums and lots of Large Tiles above Reference Set.  We
didn't look at ADC counts.  All T1, T2, and T3 CAT LEDs showed lots of energy
also so this was not just a readout problem.  TRICs did not report any errors
with the INITIALIZE/RESTORE.

We tried to EXCLUDE all Trigger Towers.  TRICs again reported no errors but
TRGMON was still showing the same big global energy sums, etc.  We checked
CAT LEDs and they were still munged.  We still did not think to look at
ADC counts.

We tried a full INITIALIZE.  Again no errors from TRICs but no change in
TRGMON or CAT LEDs.

We checked the suspect Tier 1 Timing Signals but they all looked good.  We
then looked at ADC counts and saw random (but mostly large) values.

We tried to EXCLUDE a single EM Trigger Tower.  TRICs responded with BAD
PARAM.  We then rebooted the TCC without giving Philippe a chance to look
at anything.  This was stupid but rebooting the TCC did solve the immediate
problem.

This is now the second time that we have seen this "third" type of TCC
problem.  Both times have been associated with (long?) shutoffs of M114.
It sounds suspiciously Zeller-related.  The next time this happens we
need to let Philippe take a look.

We re-routed the AC cabling for the fans, water valve, and Norm Amos crate
in Rack M124.
..............................................................................

Date:  31-MAR-1994     At: Fermi       Topics:  L15CT Installation Work:
        through                                 Ironics, Vertical Interconnect,
        2-APR-1994                              Possible L1 Problems Dean Sees,
                                                Made a run of Find_DAC,  some
                                                Level 2 disable scalers look
                                                funny,  +14,-14 Ref Set Test
2-APR-1994

Made a run of Find_DAC,  some Level 2 disable scalers look funny

Yesterday they made another run for Kathy Streets for L1 calibration.
The run only lasted about 10 minutes (most of which I was on the phone
and talking to people about the -14 vs +14 "problem").  Thus I only
was able to watch TrgMon for about the last one minute of this run. I
watched Global CT display and not the scalers.  About 1/2 of the events
that I saw were from -14 and the rest from the other 7 eta rings.

Between stores I setup a +14 only Ref set and a -14 only Ref set and tied them
to two Spec Trigs with a count threshold of 1.  I lowered the Ref Set to 1
GeV to get some rate.  Typically this was about 1 Hz with possibly the -14
being higher.  The accelerator then injected 6 small proton bunches and
the +14 rate went up to typically about 10 Hz.  The accelerator then did
something that made the D-Zero loss monitors go way up and the +14 rate went
into the hundreds and the -14 rate into the tens.

1-APR-1994
----------
I got a master and a slave Vertical Interconnect from Greg Cisko.  The address
switch on the Master was set to $20000000 and I moved things around until the
master woke up at $10000000.  I now think that the rocker closest to P1 is the
A31 key and that the lower order address lines work their way towards P2.  But
there is no documentation that explains this.  On the Slave Vertical
Interconnect I added jumpers  J10  and J15  to turn on all of its slot 1
controller functions.  I believe that the DIP switch on the Slave Vertical
Interconnect puts it at $4000 in Short I/O space but once again this is not
clear to me.

Tested the Vertical Interconnect, the VBD Buffer, the two 214's, the Ironics
control of the 214's via the Readout P2, and the VSB Bus.  This was done
just using the 135Bug MM, MD, and memory test routines.

                        Control Cables
                        to the MVME214
                        --------------
                        Slot13  Slot14        Slot #13           Slot #14
   RC Ironics Port 4    Cab #1  Cab #2     MVME-214 SN#11     MVME-214 SN#12
  ------------------    -----   ______     --------------     --------------
  write $10 read $EF      0V     +5V       no VME  VSB ok     VME ok  no VSB
  write $20 read $DF     +5V      0V       VME ok  no VSB     no VME  VSB ok
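
The readback values in the table above equal the 8-bit one's complement of
the values written.  A minimal sketch of that relationship follows; the
function name is hypothetical, and the log does not say where the inversion
happens, so this only reproduces the two observed rows.

```python
def readback(written: int) -> int:
    """Return the 8-bit one's complement of the byte written to the port.

    This only mirrors the two rows observed in the log; it is not a model
    of the Ironics hardware.
    """
    return ~written & 0xFF


for written in (0x10, 0x20):
    print(f"write ${written:02X} read ${readback(written):02X}")
```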


31-MAR-1994
-----------
Level 1.5 CT Work
-----------------
Installed the -5.2 power wiring and the inter-P2 cabling in the upper L15CT
crate.  Cut an opening in the rear lower air baffle to provide access to the
VBD P3 connector for the SRDY and DONE signals.  This VBD is now operating OK.
Its control reg had a bad value for the crate ID address which it checks at
wake up time.  I moved the MVME135 to the Bus #2 section of the crate to check
out the VBD and the Short I/O memory.  The Short-214 that I'm using is the
official spare from here.  It was set up in some goofy way (i.e. decoded 8k
at $8000).  Thus it was NOT really a good spare card for here.  It is now
back to normal (i.e. 2k at $9000) and is working OK.

I brought three of the P2 Paddle Cards here and they are installed: Term
Select, Readout Control, and Path Select.  I had to cut the bottom rear panel
to allow room for and access to the inter-P2 cables.

I brought three Ironics cards here: SN#3, SN#4, and SN#6.  These will be used
to service the above 3 P2 Paddle Cards.  There was a spare Ironics card
here at Fermi (SN# none) which I also used.  This card was the one that had
operated at NWA.   I have plugged the Ironics cards into the crate.

I was concerned that without software to write "0's" into the bits that are
inputs there could have been fighting lines (i.e. external driver is
high and the Ironics bit output driver is low).  The Ironics manual does not
say that the VME crate reset signal does anything.  I traced the SYSRESET
signal on an Ironics card and it in fact does clear all six of the National
8211 output driver chips.  So it is OK to plug the Ironics into their P2
cards even though there is no software yet.

The Ironics were installed as follows:

    Crate Slot    Serial Number    Base Address         Function
    ----------    -------------    ------------    ------------------
         2            SN# none        $F010          M103 Comm
         3            SN#3            $F020          Term Select
         5            SN#4            $F040          Readout Control
         6            SN#6            $F080          Path Select


Possible L1 Errors
------------------

Dean reported to me two possible L1 errors:

Now that muon has gotten rid of most of their problems, about once every 15
to 20 minutes of normal global physics running Dean sees crate $B "stale data"
message.  This means that when the token came around the data in the VBD had
an "older" 4 bits of sync information.  My best idea so far is to wait until it
does this and then run into the VTC console and look for an error.  I could
also change the VTC program to NOT write the 1's and 0's so that error messages
would stay visible.

Near the end of this afternoon's run Dean said that a couple of times Spec Trig
#22 went to a very high rate and then after perhaps 10 minutes it came back
down to a normal rate.  During the high rate time Trig Tower -9,22 showed
up in the Total Et Jet List all of the time.  The main readout does not show
energy in this Trig Tower region during the period of abnormal high rate.
..............................................................................

Date: 22-25 MAR 1994   At: Fermi  Topics: L1.5 CT installation and
                                          testing work
                                          Some random loops with errors

We installed some more of the L1.5 Cal Trig hardware at Fermilab.  Some of
this installation work is permanent, and some was undone and brought back
to MSU when we left.  The things that were installed were:

    Component                                       Still at FNAL?
    ---------                                       --------------
    DC's in Racks M103 and M104                     YES
        (including all necessary cabling)
    (DC S/N-P2 in M103, S/N-P3 in M104, these
    are DCs from the first Prototype manufacturing
    run)
    CRC S/N-1 (with only 2 Channels) in M124        YES
    ERPB MTG PROMs and Patch Panel                  YES
    Hydra Card "B-2"                                YES
        (including 8 cables to CRC)
    MVME135 CPU                                     YES
    MVME214 Memory Card                             YES
    Debugger Pod for Hydra                          YES
    IBM-PC to operate debugger                          NO

We spent most of this week performing data transfer tests for the L1.5
Cal Trig.  We wanted to demonstrate that we can reliably transfer data
from the 2 racks of CTFEs that are currently instrumented all the way
into the DSPs.  In this we were successful.  We have been able to perform
over 1,000,000 readout cycles (corresponding to 1GB of data transferred
from CTFEs to DSPs) with no errors that can be blamed on the data
transfer mechanism.

We first used the counter/switch board to inject data into a single DC, and
demonstrated that we could transfer data from this counter/switch board
into both CRC Channels (and then into the DSPs).  We ran the "real" L1.5
Cal Trig code on the DSPs and wrote Mark and Force Pass Data into the
MVME-214 memory module.  The Service 68K CPU provided the necessary EC/RC
services, and also checked the data in the MVME214.  The checking was done
by allowing the 68K to "learn" the data pattern on the first transfer, and
then requiring it to "check" the data pattern on every subsequent transfer.
It examined the entire Mark and Force Pass subsection (for DSP B1, B3, and
B4) of the Data Block.
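
The "learn and check" idea described above can be sketched as follows.
This is only an illustration in Python, not the 68K code; the class and
method names are hypothetical, and the block passed in stands for the Mark
and Force Pass subsection read back from the MVME214.

```python
class LearnAndCheck:
    """Learn the data pattern on the first transfer, check it afterwards."""

    def __init__(self) -> None:
        self.reference = None   # pattern learned from the first transfer
        self.transfers = 0      # number of transfers seen so far
        self.errors = 0         # number of transfers that did not match

    def process(self, block: bytes) -> bool:
        """Return True if the block is accepted (learned or matching)."""
        self.transfers += 1
        if self.reference is None:
            # First transfer: "learn" the pattern, nothing to compare yet.
            self.reference = bytes(block)
            return True
        if bytes(block) != self.reference:
            # Subsequent transfer: "check" against the learned pattern.
            self.errors += 1
            return False
        return True
```

Note that a scheme like this cannot notice a first transfer that was
already wrong, which is presumably why the patterns were also checked by
hand.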

We then allowed the CTFEs to provide the data for the DSPs.  This was done
using "Pallet" .DAT files written by Philippe.  These .DAT files paint
some pattern in all of the CTFEs in a given rack.  We again used the
"learn and check" control software to verify the transfers.  We also knew
how the TT Data in the DSPs should look (because we knew what we put in the
CTFEs) and checked by hand to verify that the data looked correct.

We have several "Pallet" files.  Each attempts to find a different failure
mode.  Each "Pallet" file is named

    [TRIGGER]L15CT_PALLET_M10%_*.DAT

Where the '%' is either '3' or '4' (indicating Rack number), and the '*'
describes the data which should be in the EM TT Data section in the Local
DSPs. The Tot Et tries to have the same pattern as EM, but the intent
failed at |eta|= 2 and 3 because some of the values that need to be loaded
in the HD channel are close to 8 (but different) and fall within the low
energy cutoff. Tot Et at |eta|=2 is ahead by one count of the desired value
and |eta|=3 is ahead by two counts.

The existing "Pallet" files are:

    '*'     Description
    ---     -----------
    80      Each EM Longword in DSP Memory = $80808080 \ exercise ability to
    7F      Each EM Longword in DSP Memory = $7F7F7F7F / drive all bits low/high
    55_AA   Each EM Longword in DSP Memory = $AA55AA55 (+Eta) \ (max switching)
                                             $55AA55AA (-Eta) | alternating bits
    AA_55   Each EM Longword in DSP Memory = $55AA55AA (+Eta) | in words and
                                             $AA55AA55 (-Eta) / in byte stream
    COORD   Each EM Longword matches eta/phi coords in hex (check readout order)


These files can find bits stuck low, bits stuck high, neighbor bits shorted
together, some classes of readout order problems, some ERPB problems (internal
Xilinx screw-ups), and also can maximally torture the data transfer by
requiring each data line to make a transition on every byte transfer.  All
of these files have been run and checked both by hand and by the "learn and
check" program.
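
As a rough illustration of why the 55/AA files give "max switching" while
the 80 and 7F files do not, the sketch below counts which of the 8 data
lines change state on every byte of a repeated longword.  The byte order
within a longword is an assumption (most significant byte first), and the
function names are hypothetical.

```python
# EM longword values taken from the table of "Pallet" files above.
PALLET_LONGWORDS = {
    "80": 0x80808080,
    "7F": 0x7F7F7F7F,
    "55_AA (+Eta)": 0xAA55AA55,
    "55_AA (-Eta)": 0x55AA55AA,
}


def byte_stream(longword: int, repeats: int = 2) -> list:
    """Bytes of the longword, MSB first (assumed order), repeated."""
    return list(longword.to_bytes(4, "big")) * repeats


def lines_toggling_every_byte(stream: list) -> int:
    """Bit mask of the data lines that change on every byte-to-byte step."""
    mask = 0xFF
    for previous, current in zip(stream, stream[1:]):
        mask &= previous ^ current
    return mask


for name, longword in PALLET_LONGWORDS.items():
    mask = lines_toggling_every_byte(byte_stream(longword))
    print(f"{name:14s} ${longword:08X}  toggling lines mask = ${mask:02X}")
```

The 80 and 7F bytes are bitwise complements of each other, so between the
two files every line is held both low and high.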

These files can be cloned to test other racks by EVE-search/replacing the MBA.

We found no design problems with the data transfer mechanism but we
did find some problems with ERPB cards and CTFE cards.  What we found
was:

    (1) the top ERPB in Rack M104 (S/N-18) appeared to have a problem
        of repeating the TT Et in both Towers serviced by one Xilinx
        chip.  We replaced this ERPB with ERPB S/N-19 and the problem
        went away.

    (2) the CTFE servicing ETA = +1..+4, PHI = 26 appears to have the bit of
        value 2 stuck "on" in the 9-bit Total Et output driver for ETA = +2.
        This CTFE has NOT been repaired or even looked at.  We will wait
        for the next convenient power-off opportunity.

Running some Random loops on 23-MAR detected a weird error after 10-40k
loops. One or a random subset of the 4 Global EM Tower Counts is off by
+1, -1, -2, or +2 counts, but this is never repeatable when redoing the same
loop. Also notice that there are some instances with combinations of -1 and
-2 or +1 and +2 among the 4 reference sets, but they are always all ahead or
all behind at once.

Philippe used the Tree Browser to try and locate the problem. Let it run
until it fails, write down all the tree outputs, then make it redo the same
loop and chase down which output has moved, then start over one lower
level down in the tree. Chase it down to SIGN_ETA(NEG) MAGN_ETA(13:16)
PHI(1:8). The symptoms seemed strange, with more than one bit, or even one
word into the CHTCR being different. But the bits that are unstable seem to
come from the PHI(3) card. I could witness only two instances of this
before we had to relinquish the system. Also remember that we just turned
on additional clocking of the CTFEs right before running these tests; this
might be what has pushed this CTFE over the edge.
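
The chase-down procedure above (record all tree outputs on the failing
loop, redo the same loop, see which output moved, then go one level down)
can be sketched as a simple descent.  This is not the Tree Browser itself;
the Node type, the function name, and the snapshot dictionaries are all
hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One adder-tree output; children are the outputs that feed it."""
    name: str
    children: list = field(default_factory=list)


def chase(root: Node, failing: dict, rerun: dict) -> list:
    """Follow the output that differs between the failing loop and its rerun.

    failing and rerun map output name -> value read back for the same loop.
    Returns the path of names from the root down to the lowest moved output.
    """
    path = [root.name]
    node = root
    while True:
        moved = [c for c in node.children if failing[c.name] != rerun[c.name]]
        if not moved:
            return path
        node = moved[0]          # follow the first output that moved
        path.append(node.name)
```

Because the error here was not repeatable, the rerun acts as the "good"
reference; a systematic error would need a computed expectation instead.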

Also, while investigating the above problem, Px started acting up in the
same way as in the 16-FEB entry.  This time Philippe tried to modify one thing
at a time to locate the source of the problem. Pushing on the Px card in the
crate at (eta,phi)=(-5:8,17:24) didn't help. Unplugging, inspecting and
replugging the cable from the Tier 1 cat2 did not help. But shoving the
Tier #2 Px cards made the problem go away again, even though no movement of
the card could be felt.
..............................................................................

Date: 22,23-MAR-1994   At: Fermi  Topics:  CTMTG PROMs, DCs installed
                                           Foreign Scalers/AND_OR Terms
                                           for Norm Amos

Installed DC's in M103 and M104.  Used the ERPB Test Data generator on the
DC in M104.  Used the 68k program that checks the data in the MVME214 based
on the low order byte of the first Tot Et longword.

Installed new PROM's in the Calorimeter Trigger MTG in PROM positions #1 and
#2.  PROM #1 moved from SN# 1L to 1M.   PROM #2 moved from SN# 2K to 2L.

Checked the serial numbers of the Tier 1 Backplanes at the high eta racks.
We need to check this again against the inventory log book for the backplanes
to see which ones have the new style short pins.

             M110    M111    M112
            ------  ------  ------
      Top    SN#17   SN#19   SN#22
   Bottom    SN#18   SN#20   SN#21


Renamed an AND-OR Input Term for Norm Amos.  And-Or Input Term number 120
was changed from  MR_CAL_LOW  to  CAL_RECOVERY.


Work on setting up new Foreign Scalers for Norm Amos.  The following table
shows the wiring between the Bagby M122 rack NIM to ECL module and the
Foreign Scalers.  The changes are indicated by a "*".


  NIM to ECL      Pair on
  Module Lemo      the 17
  Connector      Pair Cable        What signal is it.   Where does it go.
 -------------  -----------  -------------------------------------------------
           top       17      Reset BX Count into MR 29 cycle to  Foreign #4.
  2nd from top       16      L0 Fast Z Good to our scalers.
  3rd from top       15      Not connected to any of our stuff. Mod Ch in use.
  4th from top       14      Not connected to any of our stuff. Mod Ch in use.
  5th from top       13      Qty #3 to the per Bunch Luminosity Scalers.
  6th from top       12      MRBS_Loss signal to Foreign Scaler #1 Gate A.
  7th from top       11      MicroBlank signal to Foreign Scaler #2 Gate A.
  8th from top       10      MRBS_Loss .or. uBlank to Foreign Scaler #29 Gate A.

* 9th from top        9      This is now a free Foreign Scaler.
                             Foreign Scaler #36 Gate A
                             DBSC Ch #1 in slot 11 CA=32
                             this was MR_Veto_Cal_Low

*10th from top        8      BX_Cnts_MR_Hi_or_uB_or_Mu_HV
                             BX_Counts_of_
                                MR_Veto_High_or_Micro_Blank_or_Muon_HV_Recovery
                             Foreign Scaler #35 Gate A
                             DBSC Ch #2 in slot 11 CA=32
                             this was MR_Veto_Muon_Low

*11th from top        7      BX_Cnts_MR_Hi_or_Low_or_Mu_HV
                             BX_Counts_of_
                                MR_Veto_High_or_MR_Veto_Low_or_Muon_HV_Recovery
                             Foreign Scaler #34 Gate A
                             DBSC Ch #3 in slot 11 CA=32
                             this was MR_Veto_Cal_High

*12th from top        6      BX_Cnts_of_MRBS_or_uB_or_Mu_HV
                             BX_Counts_of_
                                MRBS_Loss_or_Micro_Blank_or_Muon_HV_Recovery
                             Foreign Scaler #33 Gate A
                             DBSC Ch #4 in slot 11 CA=32
                             this was MR_Veto_Muon_High

 13th from top        5      BX_Counts_of_MR_Veto_Low
                             Foreign Scaler #32 Gate A
                             DBSC Ch #1 in slot 12 CA=35

 14th from top        4      BX_Counts_of_MR_Veto_High
                             Foreign Scaler #31 Gate A
                             DBSC Ch #2 in slot 12 CA=35

*15th from top        3      BX_Cnts_of_MR_Veto_High_or_Low
                             Foreign Scaler #30 Gate A
                             DBSC Ch #3 in slot 12 CA=35

 16th from top        2      NC


Recall that the Lemo on the Module to pair number on the 17 pair twist and
flat is the following:  Top Lemo is pair #17, the bottom Lemo is pair #2 and
pair #1 is not used.
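
The Lemo-to-pair rule just stated, together with the Foreign Scaler
assignments from the table above, can be written down as a small lookup.
A sketch only; the names are hypothetical, and the scaler numbers are
copied from the 9th through 15th rows of the table.

```python
def pair_for_lemo(position_from_top: int) -> int:
    """Pair number on the 17 pair cable for a Lemo counted from the top.

    Top Lemo (position 1) is pair #17, bottom Lemo (position 16) is pair #2;
    pair #1 is not used.
    """
    if not 1 <= position_from_top <= 16:
        raise ValueError("the module has 16 Lemo connectors")
    return 18 - position_from_top


# Lemo position from top -> Foreign Scaler number (Gate A), rows 9..15.
FOREIGN_SCALER_GATE_A = {9: 36, 10: 35, 11: 34, 12: 33, 13: 32, 14: 31, 15: 30}

for position, scaler in FOREIGN_SCALER_GATE_A.items():
    print(f"Lemo {position:2d} from top  pair {pair_for_lemo(position):2d}"
          f"  Foreign Scaler #{scaler}")
```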

  The proper (I hope) edits have been made to  TrgCur:Trics_Boot_Auxi.dat
  to get this new Foreign Scaler information put into the Begin Run End Run
  Pause Run Luminosity files.   Put the old version of Trics_Boot_Auxi.dat
  in [TrgCur.Obsolete].  The new version of Trics_Boot_Auxi.dat is in D0::
  TrgCur:,  D0HTCC::[Trigger],  and MSUHEP::TrgCur:[DZero].

  The proper (I hope) edits have been made to  HTrgMon:TrgMon_FS.RCP  to
  include these new Foreign Scalers in the TrgMon Display.  This file was then
  copied to MSUHEP::HTrgMon:  and to  D0::User1:[Trguser.TrgMon].

  I edited the [D0_Text.Scalers]Scaler_Assignments.Txt to show the new use
  of these scalers.
..............................................................................

Date: 17-21-MAR-1994   At: MSU    Topics: Counter Switchboard - DC - CRC - DSP
                                          Timing
                                          DC setup

Standardized how to set up the DC's which will be used at FNAL.  The
switches should be set as follows:

    H2:  This jumper block chooses the delay between the data transition
         and the rising edge of the Strobe to the CRC.  Currently we would
         like to have 10 ns of set-up time between the data transition and
         the rising edge of this strobe.  Set H2 as follows:

            (backplane side of DC)

                  o   o
                  o   o
                  o===o  <- set the jumper at the 3rd position from the
                  o   o     backplane end of H2
                  o   o
                  o   o
                  o   o
                  o   o
                  o   o


    H3 and H4:  These jumper blocks allow spare signals to be sent to or
                received from the CRC card.  For now we are not using any
                spare signals to/from the CRC so these jumper blocks should
                be left completely unused (no jumpers or wires installed)


    SW1:  Mode/ID Switch

            Positions 1 through 4 of this switch set the ID of this
            DC.  Position 1 is the MSB, Position 4 is the LSB.  These
            switches indicate which Rack is being serviced by this DC:

                ID      Rack Number
                --      -----------
                 0      M103
                 1      M104
                 .
                 .
                10      M112

            Positions 5 through 7 of this switch are not used and should
            be set to the DOWN position.

            Position 8 of this switch selects whether the MTG or the
            Serial Configuration PROM (SCP) is the source of the Xilinx
            configuration.  It should be set to the DOWN position (selecting
            the SCP as the Xilinx configuration source).
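
    As a cross-check of the MSB/LSB convention for positions 1 through 4, a
    minimal sketch of the ID decoding is given below.  Which physical switch
    direction reads as a "1" is not stated here, so the booleans are simply
    "bit set"; the names are hypothetical, and only the three table rows
    written out above are included in the rack lookup.

```python
# Only the rows written out in the table above; the elided rows are omitted.
RACK_FOR_ID = {0: "M103", 1: "M104", 10: "M112"}


def dc_id(positions_1_to_4) -> int:
    """Decode the DC ID from SW1 positions 1..4 (position 1 is the MSB).

    Each element is truthy when that switch position represents a 1 bit
    (assumption: the log does not say whether that is UP or DOWN).
    """
    bits = list(positions_1_to_4)
    if len(bits) != 4:
        raise ValueError("exactly 4 switch positions are expected")
    value = 0
    for bit in bits:
        value = (value << 1) | int(bool(bit))
    return value


print(dc_id([1, 0, 1, 0]), RACK_FOR_ID[dc_id([1, 0, 1, 0])])
```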


    SW2:  SW2 is used to select various "bells and whistles" of the DC.

            Positions 1 through 7 of this switch should all be OFF (down)
            for now.

            Position 8 of this switch should be DOWN for POSITIVE ETA, and
            UP for NEGATIVE ETA.  This switch value is sent to the ERPBs
            and they use it to select readout order.


Dan and Steve spent several hours working with the "Dan Counter" to DC to
CRC to DSP data transfer.  This is a summary of what we have learned:

    (1) The following files are stored on the L1.5 Cal Trig Disk for the
        Logic Analyzer:

            C4PSS_1.M20         all 100 ns byte period, 10-ns delay (i.e. H2
            C4PSS_2.M20         set on 3rd position), 40-ns DC_STROBE (i.e.
            C4PSS_3C.M20        the 2nd version of the DIST4 GAL [dist4v01]).
                                The one labelled 3C shows every 4th /CRDY
                                (corresponding to the 1st longword in a
                                transfer following a longword transfer)
                                "stretched".  These were taken with only
                                one DSP running all 4 Comm Ports.


            C8PSS_1C.M20        all 100 ns byte period, 10-ns delay (i.e. H2
            C8PSS_2.M20         set on 3rd position), 40-ns DC_STROBE (i.e.
            C8PSS_3C.M20        the 2nd version of the DIST4 GAL [dist4v01]).
            C8PSS_4W.M20        The ones labelled C show every 4th /CRDY
                                (corresponding to the 1st longword in a
                                transfer following a longword transfer)
                                "stretched".  The one labelled W shows every
                                4th /CRDY stretched longer than typically seen
                                on the "C" captures.  These were taken with
                                2 DSPs each running all 4 Comm Ports.  All
                                traces are from ONLY ONE DSP.


    (2) There are 3 classes of timings that are seen.  These classes are:

            (a) a "normal" byte transfer (i.e. not the 1st byte after
                a complete longword transfer).

            (b) a "C-capture" version of the 1st byte after a complete
                longword transfer

            (c) a "W-capture" version of the 1st byte after a complete
                longword transfer.


    (3) The critical timings we have seen are (note:  all events
        are actually recorded at the CRC end unless otherwise noted.
        All recordings at the CRC have been "de-skewed" with respect
        to each other [done by measuring at similar points on the CRC
        path, NOT by calculating skew and subtracting from measurements]).

(a) For a "normal" byte transfer:
..............................................................................

Date: 9,10-MAR-1994   At: D-Zero  Topics: Install ERPB's in M103 and M104,
                                      Install cables  to M124,  Repair L1
                                      Trigger Tower +5,22 EM, Power Cycle boot.

Installed ERPB's in the bottom half of M103 and in M104.  It takes about 2 1/4
hours to clip and shroud a rack and another 2 hours to stack and cable the
ERPB's.  The ERPB's that are installed are serial numbers (top to bottom of
rack):
    M103:  6, 5, 2, 1, 10, 9, 7, 8    M104:  18, 17, 16, 15, 14, 13, 12, 11

Work on making cables to go between M124 and the L1 Cal Trig racks.  Measure
some lengths:   From inside M124 to the M105 top entry clamp is  14 sections.
                From M105 top entry clamp to D.C. in M103 is 6 sections.
                From inside M124 to D.C. in M112 is 27 sections min.

Made the M103 to M124   D.C. --> CRC Cable 20 sections long.
Made the M104 to M124   D.C. --> CRC Cable 19 sections long.
Made the M124 to M103:M106   MTG --> D.C. Cable 35 sections to the first
  D.C. connector (which is in M103), then three sections to the next D.C.
  connector, and then three more sections to the next D.C. connector and
  finally three more sections (for a total of 47 sections) to the D.C.
  connector in M106.

Note that the TSS Bus cables (and the CBus cables) have three sections between
racks in the L1 Cal Trig.

An Owen Pulser Run showed that  +5,22 EM  was too big by a factor of 2.  The
problem was a cold solder joint on the ground leg of R5 in the Term-Attn.

When I powered up the L1 system on Thursday morning after working Wednesday
night installing ERPB's, the system would not initialize properly.  TCC was
running OK.  TRICS software was OK, i.e. it would take my Initialize All
command and then come back with a Bad Failure after about one minute (i.e.
about twice as long as normal).  The lights on the BBB cards in M114 would
flash during the initialize so something was happening.  The TRICS log file
had many messages of the following:
   Assistant CBus Not Immediately Released
I ended up panicking and power cycle booting the TCC, i.e. power cycling both
the 4000 itself and its BA23 box.  After doing this the L1 system initialized
all OK.  After talking with Philippe we suspect that one of the DRV11J cards
may have become confused (because L1 power was off) and needed to be reloaded.
If I had power cycled only the BA23 box then the TRICS software running in the
4000 would have automatically reloaded the DRV11J's when power was restored
to the BA23 box.  There was no indication of any problem in the 4000 box or
in the running TRICS software.
..............................................................................

Date:  3,4-MAR-1994    At: D-Zero  Topics:  Bring L15CT VME crates and CRC
                                   crate to D0 and install them,  Bring MTG
                                   for L15CT to D0 and install it,  Cables from
                                   M124 to M114,  Install some ERPB in the L1
                                   racks.

Bring MTG card SN#24 to Fermi for use as the CRC_MTG.  Bring MTG card SN#27 to
Fermi as a spare.  SN#27 has the ECO for the global external signal input.

Install the two L15CT crates in M124.  L15CT crate SN#1 is on top and SN#2 is
on the bottom.  Also install the CRC Crate, the radiators, chassis supports,
and front panels.  To make room the IBM token ring stuff in the back of the
rack was moved up so that it is behind the Shea modules.  To make room at the
bottom back of the rack, the two LAr monitor HV fanout modules were removed
from their card file, the card file pulled out, and the modules put on the
floor of the rack behind Norm Amos's NIM bin.  Norm Amos still has his Active
MR Veto NIM Bin in the bottom of M124 screwed in from the back.

I strung the CBus cable (Assistant COMINT CBus #3) from M114 to M124 along
with a 34 conductor cable for control signals.  These cables are laid up
against the north vertical wall of the cable tray that runs along the back
of the north aisle of racks.  Over this I put another 64 twist and flat to
act as a shield between our stuff and the rest of the stuff in this cable
tray.

The L15CT Power Pan is on top of the air conditioner but not yet tied
down.  A tie down platform needs to be made for this and the VT terminal
that will be used for the L15CT 68k.

The CRC_MTG is powered up and running.  It should be addressed as CBus #3,
BBA 88,  MBA 89,  CA 35.   It receives its Clock and Once per Turn Marker
from the monitor output on the L1 Framework Main Timing MTG.  These signals
are carried from M114 to M124 over the 34 conductor cable.   See the file
in TrgL15CT:[Hardware_Software_Text] for more details about CRC_MTG.

Signals carried on the 17 pair control cable between M114 and M124.

    Pair               Function
   -------     --------------------------------------------
       1        Not used, shield
       2        Once per Turn Marker to the CRC_MTG
                  "MTG PROM Address Counter Reset"
       3        Not used, shield
       4        Clock to the CRC_MTG
       5        Not used, shield
       6:17     Not yet assigned

Friday afternoon the online cluster crashed so I got a chance to install some
ERPB cards.  4 ERPB's are installed in the range  eta +1:+4  phi 1:16.  They
look fine.  The daisy chain cables are installed but no parallel timing cable
yet.  The ERPB's appear to support themselves OK.  The parallel timing cable
could be used to give some support.  Where does the DC plug in,  i.e. to just
2 connectors or to all three?  The yellow and red LED's are ON and the green
LED is OFF.  I estimate about 2 hours per rack to install the ERPB's plus
extra time for the DC and the cables back to rack M124.

I need to make a 1U patch panel for the L15CT.  I think that it can replace
the 1U air flow panel just below the CRC Crate.  Need to make a holder for
the top of the 1st floor MCH air conditioner to hold the L15CT Power Pan and
the VT terminal for the L15CT.  I think that the top of the air conditioner is
31" wide (north-south direction) and with the Power Pan up there, there is
13" left for the VT Terminal, which should be just enough.
..............................................................................

Date: 16,17,18-FEB-1994   At: Fermi  Topics:  Check the M109 Tier 2 CTMBD,
                                   Replace two CTFE PROM's,  Test with Cal
                                   Random program and over night test runs,
                                   Cook two MVME-135 U86 PAL's,  Finish the
                                   Two COMINT ECO to COMINT SN#08,  Measure
                                   the current space for L15CT,  Installed
                                   more drip Detector strips and connected
                                   the RMI to our RPSS, G10 shroud for bottom
                                   M114 radiator, Description about
                                   modification to a CHTCR to readout the
                                   Beam Crossing Number,

    Check missing Large Tile supposedly fixed with CTMBD swap last week:
>> The problem isn't fixed<<. Things are "less bad": only one tower and one
refset and bad only about 10% of the time.
Exclude all EM and HD towers, and set all large tile reference sets to 0 GeV.
Then use TRGMON's "spy window" to display in Hex mode at item 4945. All large
tiles should appear above threshold for all refsets and show as FF. But item
#4947 still reads as FB about 10% of the time. This is Large tile +9:12,17:24
for large tile refset #0.
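
A quick way to see which bit drops out when item #4947 reads FB instead of
FF is to XOR the two values.  This sketch only does the bit arithmetic; the
function name is hypothetical, and the association of that bit with the
large tile is taken from the text above, not derived here.

```python
def differing_bits(expected: int, observed: int) -> list:
    """Bit numbers (0 = least significant) where the two bytes differ."""
    diff = (expected ^ observed) & 0xFF
    return [bit for bit in range(8) if (diff >> bit) & 1]


# FF expected (all large tiles above threshold), FB seen about 10% of the time.
print(differing_bits(0xFF, 0xFB))
```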

    Replace 2 PROMs (HD PROM at +17,12) and (Px PROM at +3,15), then run the
lookup PROM test with success on just these proms.  Philippe found problems
with these two PROM's a couple of weeks ago when he was checking all PROM's.
See   13-JAN-1994.

    Run Lookup test on all PROMs (test now handles Tier#1 truncation and
negative numbers) (17-FEB 22:23...18-FEB 03:04)
One error detected HD low by 4 counts NEG,E_1,P_1 page #2 EM 255 & HD 165.
Doing the same loop again repeated the error. Note that this is page #2 and
that the HD count is off by 4 counts, like in the random test errors below.
Run Prom test on this prom only (just HD, then all 4 proms), no error.
Redo prom test on same page the next day, and there were no errors.

    New problem detected while trying to run random test. Px low by 8 counts,
even before the first loop is run. Using Tree browser, locate the problem
coming from cell -5:8,17:24. The output of the CTFE all read ok, but the input
to the tier #2 is low by 8 counts. The lights on the Tier #1 card were
displaying the correct number. Philippe pushed on the front connector of the
tier #1 Px card, and pushed hard on the Px cards of Tier #2. Nothing seemed
incorrectly seated, but the problem went away.

    Start Random test overnight. When they initialized the trigger in the
morning, caltrig had done 3,382,163 loops. But, because of operator stupidity,
the test ran on positive etas only. The details of all random tests run:

   162,000  16-FEB  23:37 - 00:02   no error until operator stop
 3,382,163  17-FEB  00:05 - 08:42   no error until init (but pos_e only)
    87,955  17-FEB  11:28 - 11:42   HD 2nd lookup Sum is low by 4 counts
                                    error not systematic re-doing loop
                                    this was on lookup page #2.
 1,060,237  17-FEB  14:12 - 16:58   HD 1st lookup low by 4 counts
                                    error systematic re-doing loop
                                    this was on lookup page #2.
   125,000  18-FEB  10:31 - 10:11   no error until INITIALIZE
    74,000  18-FEB  10:46 - 10:58   no error until INITIALIZE
   529,000  18-FEB  10:59 - 12:23   no error until INITIALIZE
    98,000  18-FEB  12:57 - 13:13   no error until download
----------
 5,518,355 loops
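
The total above can be re-derived from the per-run loop counts in the
table; a short check, with the list name chosen here for illustration:

```python
# Loop counts copied from the table of random test runs above.
RANDOM_TEST_LOOPS = [
    162_000,
    3_382_163,
    87_955,
    1_060_237,
    125_000,
    74_000,
    529_000,
    98_000,
]

total_loops = sum(RANDOM_TEST_LOOPS)
print(f"{total_loops:,} loops")
```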

    Neither one of the HD errors in random test were properly traced, because
of operator stupidity while doing two things at once, and weakness in tree
browser that needs to be fixed.

Cook two U86 PAL's for the MVME-135 CPU cards.  These are original parts to
replace a PAL that I damaged a couple of weeks ago.

Add the Two COMINT ECO to COMINT SN# 08.  This is the card that had been
running at D-Zero up until the time that we installed the Two COMINT setup.
A lot of work was required to remove and rework other "white wires" that had
been added in earlier ECO's because of the way that these wires were routed
and glued down.

Measured the current amount of space available for the L15 Cal Trig.  It is
about 53 3/4".  This is about 30.7 U.  It is integer U at the top and
fractional U at the bottom where it meets Norm Amos's Main Ring Veto stuff.

   Each of the two L15 Cal Trig VME crates is 9U (i.e. 15 3/4") plus a 1U
   (i.e. 1 3/4") fan tray.  Thus both L15 Cal Trig VME crates plus their fan
   trays take up 20U total (i.e. 35").  This leaves about 18 3/4"  (about
   10.7U) for the following:  Card file for CRC, CRC power supplies, vertical
   air in and out of the top L15CT crate,  vertical air in and out of the
   bottom L15CT crate.
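
   The space budget above can be reproduced with a few lines of arithmetic.
   A sketch using the standard 1 3/4" rack unit; the variable names are
   chosen here for illustration.

```python
INCHES_PER_U = 1.75          # one rack unit is 1 3/4"

available_in = 53.75         # measured space for the L15 Cal Trig
crate_u = 9                  # each L15 Cal Trig VME crate
fan_tray_u = 1               # fan tray under each crate
n_crates = 2

used_u = n_crates * (crate_u + fan_tray_u)
used_in = used_u * INCHES_PER_U
remaining_in = available_in - used_in
remaining_u = remaining_in / INCHES_PER_U

print(f"available: {available_in / INCHES_PER_U:.1f} U")
print(f"crates + fan trays: {used_u} U = {used_in} in")
print(f"remaining: {remaining_in} in = {remaining_u:.1f} U")
```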

RMI Drip Detector ( 17-FEB-1994 23:08 - 18-FEB-1994 01:18 )
We installed Drip Detector strips for the 8 pack radiators in racks M100
and M113.  Because there have been no false trips from the Drip Detector
in the past week we went ahead and connected the Drip Detector so that it
would trip off all the L1 power.

To do this we installed the RMI to RPSS Box.  This box is located in the
bottom of rack M113 behind the RPSS.  It connects to the RMI in the top of
M114 via a long green RG58 BNC cable.  This box is plugged into the M114
slot of the RPSS.  If the RMI detects a water leak the RPSS will trip showing
"Air Flow" and "Water Pressure" faults in M114.

The RMI output is also connected to one of the spare rack voltage monitor
channels and Dan Owen has an alarm set on this channel.  This channel is
labeled  LV1FW_M114_4.DRIP.  We did a number of tests to prove that the RMI
Drip Detector would trip the RPSS and that the "voltage" read out from the
RMI output was close enough to fit the 0.5 Volt tolerance that Dan Owen has
set on the alarm for this channel.

I need to make a written description of the RMI to RPSS connection and the
operation of this setup.  I will put this in the TrgHard:[RPSS] directory.
I will also move all of the safety system related documents from my private
directories to the TrgHard:[RPSS] directory.

Philippe installed a G10 shroud around the lower radiator in M114 to protect
the backplanes, cards, and power supplies from a leak at the hose connection
end of this radiator.  There still is no shrouding around the upper two
radiators in M114.  Leaks from these two radiators would be very damaging.

Started writing a file in  TrgHard:[CHTCR]  to describe the modifications
to the CHTCR card that are required to read out the Beam Crossing Number.
..............................................................................

Date: 10,11,12-FEB-1994  At: D-Zero  Topics:  Bring spare cards to D-Zero Hall,
                                     Swap the CTMBD in M109 Tier 2, Setup two
                                     new Foreign Scalers for Norm Amos, Measure
                                     Muon to Us cables,  Install Drip Strips
                                     and an RMI,  Pull out old TCC,  Cook CHTCR
                                     C2R1 PROM's

Bring 3 CTMBD cards to D0.  CTMBD  SN# 35 wired for Tier 1,   CTMBD  SN# 08
wired for Tier 2,   and CTMBD  SN# 17 wired for Tier 2.
Return the BBB SN# 09 to D0 Hall.  A couple of weeks ago this card was pulled
but the problem ended up being no CBus or Time and Sync Bus terminators on the
cables going to this card.

Swap the CTMBD in M109 Tier 2.  Pull CTMBD SN# 14 and install SN# 08.  This
card is being pulled because during data block building it does not read the
first location correctly (an LTCC card).  See last week's log entry.  After this
swap I did not see any more "No Candidates In LT JL" messages.  But in a
couple hours of global physics running there were about one hundred
"LT JL Overflowed" messages.  Many Control Room people do not understand
the difference between these two messages.  Rich Astur currently has his call
to the routine to make the LT JL set a limit of 20 entries.  Remember to look
at the sort alarms output to understand how many errors there are during a run.

Setup two new Foreign Scalers for Norm Amos.  These will watch the new active
Main Ring Veto setup.

  Foreign Scaler #32, the DBSC Ch #1 in slot 12 CA=35, is fed from the NIM-ECL
  Converter Ch #13 in the Bagby rack.  This is called BX_Counts_of_MR_Veto_Low.

  Foreign Scaler #31, the DBSC Ch #2 in slot 12 CA=35, is fed from the NIM-ECL
  Converter Ch #14 in the Bagby rack.  This is called BX_Counts_of_MR_Veto_High.

  The proper (I hope) edits have been made to  TrgCur:Trics_Boot_Auxi.dat
  to get this new Foreign Scaler information put into the Begin Run End Run
  Pause Run Luminosity files.   Put the old version of Trics_Boot_Auxi.dat
  in [TrgCur.Obsolete].  The new version of Trics_Boot_Auxi.dat is in D0::
  TrgCur:,  D0HTCC::[Trigger],  and MSUHEP::TrgCur:[DZero].

  The proper (I hope) edits have been made to  HTrgMon:TrgMon_FS.RCP  to
  include these new Foreign Scalers in the TrgMon Display.  This file was then
  copied to MSUHEP::HTrgMon:  and to  D0::User1:[Trguser.TrgMon].

  I edited the [D0_Text.Scalers]Scaler_Assignments.Txt to show the new use
  of these scalers.

I checked with Norm and he says that it is OK for me to disconnect the "ECL
Box" setup that I made for him to bring some Muon L1 Terms to his stuff in
the bottom of M124.

I unplugged the cables going to the Muon system so that they could measure the
electrical length of the cables.  The lengths are:

        Muon L1 Trig's to the L1 Framework                  85 nsec.
        Muon L1.5 Trigs (Answer and Done) to the L1.5 FW    92 nsec.
        Trig-Acq-Sync Cables from L1 FW to Muon             80 nsec.

Pulled the old TCC uVAX II out of the top of M114.

Installed the RMI at the very top of M114.  Need a front panel that is about
5 screws high to cover the rest of the space where the old TCC was.  The output
of this RMI is connected to Entry 03BE of the Shea ADC  CH# 62 of Node 74D.
This is the next channel after the -5.2 Volt monitor from M114.  This RMI is
not yet connected to our RPSS so it can not trip our stuff yet.

Installed drip strips between all of our long string of racks (i.e. 11 strips).
I need to install more strips at each end  for the 8 pack radiators and do we
want drip strips in M114 ??   Verified that the connection from the RMI to the
Voltage Monitoring system was reading out OK

Cooked 16 more of the big PROM's  82HS321  for the CHTCR cards.  Each card
required 8 of these parts and there are 2 cards at MSU that need these PROM's
before we can test them.  This leaves us no spare programmed C2R1 parts.  We
still have about 30 unprogrammed parts.
..............................................................................

Date: 1,2,3,4-FEB-1994  At: Fermi  Topics: Save/archive single comint TCC files
                                           Create dual comint version TCC files
                                                        and DIRECT_TO_TCC files
                                   List of COMINT cards now at Fermi,
                                   Made new COMINT to VMX Driver cables,
                             Results of rate tests with Two_COMINT_Operation,
                             Problem of bad Timing Signal TSS-L in the
                             Tier 1's in M111 and M112 Phi 1:16,  Work on the
                             No Candidates in the Large Tile Jet List Problem.

archive TCC files
-----------------
Verify and Copy latest revision of all TCC files from TRGCUR: to [.OBSOLETE],
and save all these files in [.OBSOLETE.ONE_COMINT]. [.obsolete] is now empty and
ready for the dual comint files.

Upgrade all .DAT files for dual comint operation:
----------------
  TRICS_BOOT_AUXI.DAT           for address of end_run scaler readout
                                get ready for "end of run" crash recovery file
  TRICS_INIT_AUXI.DAT           and add reminder of related .DAT and .MSG files
  TRICS_FORCE_BUF_UPDATE.DAT
  TRICS_L1_IGNORE_L15.DAT       and add header, and comments
  TRICS_L1_OBEY_L15.DAT         and add header, and comments
  IGNORE_L0_FAST_Z.DAT          and fix Momentum Lookup programming
  OBEY_L0_FAST_Z.DAT            and fix Momentum Lookup programming

DIRECT_TO_TCC files
-------------------
Delete USER1:[TRGUSER.DIRECT_TO_TCC]RESET_GEOSECT_IN_L15.EXE, it was obsolete
rename USER1:[TRGUSER.DIRECT_TO_TCC]TEMP_NO_L0MI_V80.MSG *.obsolete
Upgrade all .MSG files for dual comint operation, files affected:
    ETA_32_TREE_CORRECTION.MSG
    FORCE_L0_FAST_Z.MSG
    RESET_GEOSECT_IN_L15.MSG

Switch to dual-comint IO
------------------------

The COMINT cards at Fermi are the following:

    The COMINT that had been running for the past year or so is SN# 08  with
    CDBE  SN# 06

    The COMINT that had been the FERMI spare for the past year or so is SN# 09
    with CDBE  SN# 05.   Note that this card was not current on ECO's  i.e. it
    would not have worked.  It did not have the ECO to use the Card Address
    PROM bit to stop the Data Block Builder,  it sent out a Data Block Complete
    signal instead of a Data Block Builder Busy signal.

    The COMINT that was brought from MSU to Fermi this week is SN# 06  and
    there is no SN# on its CDBE card.  Steve has just checked the inventory
    of CDBE's.  We will call this CDBE SN# 08.  Next time that this COMINT is
    pulled out we should write the CDBE SN on it.

Currently we have COMINT SN# 06 installed as the Pilot and COMINT SN# 09
installed as the Assistant.  Before we started Two_COMINT_Operation we tested
both COMINT SN#06 and SN#09  in Single_COMINT_Operation.  They were both OK
but L1.5 triggering was not running during these tests (but double buffering
was in frequent use).

Made new COMINT to VMX Driver cables.  The twist and flat cables are 4 twist
sections long.  The single pair cables are about 4 to 6 inches longer than
these new twist and flat cables.   The old twist and flat cable that we
removed was 9 twist sections long (i.e. about 15 feet long)  and the single
pair A13 cable was 10 feet long.

Results of rate tests with Two_COMINT_Operation:

  Running with a prescale of 500 we see the system slowly oscillate between
  two conditions.  The period of oscillations is perhaps in the 10 to 30
  seconds range.  The two limiting conditions are:

                                 Waiting for a
            Hz     FE Busy %     Free VBD Buff %
           ----   -----------   -----------------
            420       20%            15%
            475       0.2%           0.5%

  Running with a prescale of 600 the system is more stable and it generally
  sits around about:
                                 Waiting for a
            Hz     FE Busy %     Free VBD Buff %
           ----   -----------   -----------------
            435       1.5%            1.5%

If we pull the control bus cable to the L1 VBD (so that it thinks that it
always has the "Grant" to send data up to L2)  then we see that L1 builds
and sends Data Blocks at a maximum rate of about  520 Hz.

If we keep the VBD always granted and Stop reading the Vertical Interconnects
then we see a maximum rate of about  602 Hz.

In all conditions except the last one the timer that counts how long VTC has
to wait after finishing reading the Vertical Interconnects until the Slave
Ready signal arrives reads zero,  i.e. it took so long to finish reading
the Verticals that the Data Block Builders were finished before the Vertical
reads were finished.  In the last measurement, where we dropped reading the
Verticals,  then this counter said that there was a  0.38 msec/event wait
between finishing the Vertical reads and receiving the Slave Ready.
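
The two maximum rates above can be turned into a time per event with a short
sketch.  The difference between the two periods is only a rough indication of
what the Vertical Interconnect reads add per event under these conditions; that
reading is an interpretation, not a number taken from the measurements.

```python
def ms_per_event(rate_hz):
    """Average time per event, in milliseconds, for a given rate in Hz."""
    return 1000.0 / rate_hz

# Maximum rates quoted in this entry (VBD always granted in both cases).
rate_with_verticals_hz = 520.0      # still reading the Vertical Interconnects
rate_without_verticals_hz = 602.0   # Vertical Interconnect reads stopped

t_with = ms_per_event(rate_with_verticals_hz)
t_without = ms_per_event(rate_without_verticals_hz)
delta_ms = t_with - t_without

print(f"with Vertical reads   : {t_with:.3f} ms/event")
print(f"without Vertical reads: {t_without:.3f} ms/event")
print(f"difference            : {delta_ms:.3f} ms/event")
```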


Problem of bad Timing Signal TSS-L in the Tier 1's in M111 and M112 Phi 1:16.
TSS-L comes from Cal Trig MTG Ch# 13.  The BBB card for the upper Tier 1's
in M111 and M112 is in slot #17 of the upper backplane in M114.  The back
plane pass through pin that carries non-inverted MTG Ch# 13 out of the BBB
in slot #17 and into the TSS cable appears to be open.  It does not appear
to be epoxy on this pin but rather something like the pin is broken or
burned through in the middle between the connector housing and the backplane
PCB.  We did not try to pull this pin out and replace it.  Rather we switched
which BBB card is driving TSS signals to the M111:M112 Phi 1:16 Tier 1's.
We switched slot #17 with slot #14.  The BBB in slot #14 drives the M111
Tier 2.  Recall that all Cal Trig BBB's output the same set of timing
signals. Run a few thousand loops of Random test to verify that this solved
the problem encountered by Dan on 20,21-JAN-1994.

Work on the No Candidates in the Large Tile Jet List Problem.

The problem is that the Large Tile at  eta +9:+12  phi 17:24  sometimes does
not register in the pattern of Large Tiles over threshold although it does
participate correctly in generating the L1 Trigger.  This Large Tile is handled
by the lower LTCC card in the M109 Tier 2 card file.  This is MBA=209 CA=11.

The problem is that data line 3 (i.e. bit value 4) sometimes reads out low when
it should be high.  This happens only with the fast Data Block Builder reads.
Programmed I/O reads are OK.   In fact it happens only on the first read in
this card file by the Data Block Builder.  The subsequent reads are all OK.
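
A minimal sketch of how a dropped data line can be picked out by comparing a
known-good value with a suspect readout.  It follows the numbering used above
(data line 3 carries bit value 4, so line n carries 2**(n-1)).  The function
name and the sample values are hypothetical; they are not taken from the
Data Block Builder or from TRICS.

```python
def dropped_lines(expected, observed):
    """Return the 1-based data-line numbers that are high in `expected`
    but read back low in `observed`."""
    dropped = expected & ~observed
    return [bit + 1 for bit in range(dropped.bit_length())
            if (dropped >> bit) & 1]

# Hypothetical example: the good value has bit values 1 and 4 set,
# the suspect read is missing bit value 4.
good_read = 0b0101
fast_read = 0b0001
print(dropped_lines(good_read, fast_read))   # lines that dropped out
```

Comparing a Programmed I/O read against the Data Block Builder read of the
same location in this way would flag only data line 3 for the problem
described here.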

We checked Terminators (backplane and cable bus).  They are all OK.  We swapped
the LTCC cards and there was no difference.  We swapped CTMBD's and the problem
went away.

There are two spare CTMBD's at Fermi.  Both are wired as Tier 1 CTMBD's.
CTMBD SN# 19 has a very bad solder job and is being returned to MSU for
rework.  There is an IC on this card that has 1/2 of its pins not soldered.
The other CTMBD is SN# 09.  It looks OK and its tag says that it has passed
100k loops of some test.
..............................................................................

Date: 18,19-JAN-1994   At: Fermi  Topics: Distribute new "Start TCC"
      20,21-JAN-1994              Instructions, Give the Detector Shifter
                                  Training talk about L1 Cal Trig, Problems
                                  with the Fancy-214's,  Problems with Vertical
                                  Interconnect reads,  Work on the eta +13:+16
                                  Phi 1:8 CHTCR, Get M114 upper backplane ready
                                  for Two COMINT Operation.

Distribute the new (11-JAN-93) edition of the "Start TCC" Instructions (3
copies).
Give the L1 Cal Trig talk for the Detector Shift Training.
Give a L1 and L15CT talk to the Run Meeting.
Give a Framework and Cal Trig upgrade talk for the Trigger Upgrade Meeting.

We had the Fancy-214's running for about one day.  At first it appeared to
be all OK but in reality there were perhaps at least 4 different problems seen:

1. Overwritten word counts in the Short I/O Block of one of the two Fancy-214's
   With  $FFFFF009  reading 11  there was starting at  $9100
   0005   010a   0280   010a   0280   0041   910c   9140
   00a0   0029   0014   0024   0006   00aa   0080   0000

   it should have been:
   0005   010a   0280   010a   0280   0041   0041   0140
   00a0   0029   0014   0024   0006   00aa   0080   0000

   note that the two overwritten locations have either their address written
   into them or else a combination of their address and their data.   This
   problem was only seen with Fancy-214's in the L1 crate and the Fancy-214's
   were providing the Short I/O memory.

2. There was a significant block of time when the length of the L1 data block
   was about 6000 long words instead of 2843 (or whatever the proper number
   is).

   This problem was seen with Fancy-214's in the L1 crate and the Fancy-214's
   were providing the Short I/O memory.  Does it happen with our old normal
   "C" type 214's ??    This problem did not appear to stop normal Physics
   running operation.   Is this related to the "spurts" of empty or overflow
   Jet Lists??

3. With Fancy-214's operation and providing the Short I/O Blocks there was a
   period during a Calib run where the Version Number in the header was often
   wrong.  Note that on these events the Controller words and the Revision
   word (both also event to event static values)  that are on each side of the
   Version Number word  were reading out OK.   The version number was
   reading  D008000B  instead of 00000008.    After a while this problem just
   "fixed itself".    Is this related to the "spurts" of Jet List overflows
   and empties ??

4. Sometimes the pulser programming data has  FFFF  in the lower half of the
   longword that is built up from two word reads.  This is perhaps 1 in 500
   events.  This appears even with all OLD  (i.e. "C" type 214's)  in the L1
   VME crate.
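
The observation in item 1 (overwritten locations holding their address, or a
mix of address and data) can be checked against the dump with a short sketch.
It assumes the 16 values are consecutive 16-bit words starting at $9100, so
word i sits at $9100 + 2*i.  The "address high byte + data low byte" split is
one reading that fits the dump; it is not a diagnosis of the Fancy-214.

```python
BASE = 0x9100   # start address quoted in item 1

observed = [0x0005, 0x010a, 0x0280, 0x010a, 0x0280, 0x0041, 0x910c, 0x9140,
            0x00a0, 0x0029, 0x0014, 0x0024, 0x0006, 0x00aa, 0x0080, 0x0000]
expected = [0x0005, 0x010a, 0x0280, 0x010a, 0x0280, 0x0041, 0x0041, 0x0140,
            0x00a0, 0x0029, 0x0014, 0x0024, 0x0006, 0x00aa, 0x0080, 0x0000]

def classify(index, got, want):
    """Describe how a word differs from its expected value."""
    addr = BASE + 2 * index   # assumption: consecutive 16-bit words
    if got == want:
        return "ok"
    if got == addr:
        return "address"
    if got == (addr & 0xFF00) | (want & 0x00FF):
        return "address high byte + data low byte"
    return "other"

# Keep only the words that differ from what was expected.
report = [(hex(BASE + 2 * i), classify(i, got, want))
          for i, (got, want) in enumerate(zip(observed, expected))
          if got != want]

for address, kind in report:
    print(address, kind)
```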

More tests with the Fancy 214's.   Made a test version of VTC code that reloads
the Short I/O list of Word Counts right before the VBD is going to read them.
Then running with the Fancy-214's this made things much worse.  There were
lots of errors: TF, CP, BX.  Running this code with the old "C" type 214 is
all OK.

To check the Vertical Interconnect reads of the Cal Pulsers I made a test
version of the VTC code that Tests the 3rd word of each pulser to see if it
is $FFFF.  It does this by bringing the Vertical Interconnect data directly
into a working register (i.e. the data comes from the cable into a register
for testing  and not back out of the "V" type 214).

Watching during 70 Hz running I saw about one error every 15 seconds

     P2   26 errors  CC  Pulser
     P8   21 errors  CC  Pulser
     P4    7 errors  ECS Pulser
     P6    5 errors  ECS Pulser
     P9    2 errors  CC  Pulser
     P3    1 error   CC  Pulser
     P7    1 error   ECS Pulser
     P5    1 error   ECS Pulser

Note there are no ECN errors.  From what I hear it is a known problem with
Vertical Interconnects not to read back correctly.  Comm Taker uses 5 read
backs.
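
The error tally above can be totaled by pulser type with a short sketch.  The
observation time it prints is only an estimate, assuming the rate of about one
error every 15 seconds held for the whole count.

```python
# Errors per pulser as listed above: pulser -> (type, error count).
errors = {
    "P2": ("CC", 26), "P8": ("CC", 21), "P4": ("ECS", 7), "P6": ("ECS", 5),
    "P9": ("CC", 2),  "P3": ("CC", 1),  "P7": ("ECS", 1), "P5": ("ECS", 1),
}

by_type = {}
for kind, count in errors.values():
    by_type[kind] = by_type.get(kind, 0) + count

total = sum(by_type.values())
approx_minutes = total * 15 / 60   # assumes ~1 error per 15 s throughout

print(by_type)
print(f"total errors: {total}, roughly {approx_minutes:.0f} minutes of running")
```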

M114 Backplane

Cut the J4 traces between slots 14 and 15 on the upper M114 backplane.
Tested for shorts with an ohm meter and then put epoxy over the cut.

Work on the eta +13:+16 phi 1:8 CHTCR

When looking into the EM cable that goes from this CHTCR to the Tier 2 CAT2
you could see something funny with the ohm meter on bit of value 2 for EM Ref
Set 0.  It is not an open or a short to an adjacent conductor but there is
something funny about this input.  The meter reads different for this input
than for any other input on this cable.  Looking with the scope and the Diff
ECL box nothing looked bad but the DC levels of this input bit looked
different.

I was going to try swapping CAT2's but I wanted to see this problem with
Cal Trig Test before I changed things.  After waiting an hour to get a tube
I had trouble with Cal Trig Test.  Running in just eta 1:16 it made a Px
and/or Py error about once every 1000 loops.  Running with full eta it made
a Px and/or Py error about once every loop to about once every 10 loops.
Because I could not see the CHTCR problem I did not swap CAT2's.
..............................................................................

Date: 13,14-JAN-1994   At: Fermi  Topics: First Test of the Fancy-214's in the
                                          Level 1 System.

The first test of the Fancy-214's in the Level 1 system did not work out at
all.  The block going up to L2 was all junk and the length was wrong. This
was true of all events transferred up to L2.  Backed up to the old "C" type
214's.
From the 133ABug the data at $380000 looked like all junk. The data at $9010
was OK.  The data at $9100 was OK.  The data at $305000 was OK.  The data at
$B000 was OK.  The data at $B800 was OK.  The length up to L2 was wrong --->
the VBD was picking up the wrong length counts from Short I/O Address Space.
..............................................................................

Date:   13-JAN-1994     Remotely from MSU:
                        Topics:     Eta +/- 1..20 Phi 6..32 CTFE PROMs tested
                                    Random Test Runs
                                    Another Power glitch
                                    Eta +/- 17..20 CHTCR PROMs were tested.

Eta +/- 1..20 Phi 6..32 CTFE PROMs tested
-----------------------------------------
Finish systematic readout of all Lookup PROMs (started 7-jan).
The test code still doesn't know how to deal with Tier#1 to tier#2 truncation of
the MSB (test .le. 3 phis at a time), or negative numbers read from Tier#3.

The test was carried on all etas for phis 6..8, 9..11, 12..14, 15..16, 17..19,
20..22, 23..27, 28..30, 31..32.

See earlier entry for test method. I had to program an offset in the Px/Py
Global sums to keep them positive. I loaded 1/2 of full scale in the
correction register of the Tier#3 CAT3 cards  WRITEREG 0 153 39 49 16 and 0
153 37 49 16. This wasn't required for phi 1..8, which must have naturally
kept a positive sum, but is now necessary for phi 9..16.

One error detected for HD PROM at +17,12 Page #3
(error stays when redoing same loop 3 times)
    HD PROM answer is 37 instead of 39 global is now 8963
    HD CAT inputs are   28,  28,  28,  58,  28,  28,  28,  28
    Error Detected at POS,E_17,P_12 page #3 EM 255 & HD  36
later test HD PROM at +17,12 Page #3 by itself, and get the same error
    HD PROM answer is 37 instead of 39 global is now 5214
    HD CAT inputs are   28,  28,  28,  58,  28,  28,  28,  28
    Error Detected at POS,E_17,P_12 page #3 EM   0 & HD  36

Two errors detected for Px PROM at +3,15 Page #3
(error stays when redoing same loop 3 times)
(note that there is a bug in the display code, it should say +41 and not -41)
    Px PROM answer is -41 instead of 57 global is now 4193803
    PX CAT inputs are    0,   2,   4,   4,   5,   6, 505,   7
    Error Detected at POS,E_3,P_15 page #3 EM 128 & HD   0

    Px PROM answer is -41 instead of 57 global is now 4193803
    PX CAT inputs are    0,   2,   4,   4,   5,   6, 505,   7
    Error Detected at POS,E_3,P_15 page #3 EM 129 & HD   0
    PROM Failed Test at POS,E_3,P_15,EMETZ0 Page #3

    Test EM and HD channels at +3,15 alone on Page #3, and get same error
on Px PROM.  Then try different values of EM and HD by hand. After 0 loops
of random test, one needs to release the MTG clock by WRITEREG 0 105 53 33
100, aim both read and write to pipe A with WRITEREG 0 105 53 3 9 and 0
105 53 5 9. Then use a combination of Tree browsing and WRITEREG to FA 81
and 82 to set different values and read the CTFE's 29525 and CTFE partial
sums. The PROM is short by 16 counts whenever the input is 128/2 (with
EM=128/HD=0, EM=0/HD=128, or EM=127/HD=2), while inputs of  127/2 and 130/2
are ok.

Another Power glitch
--------------------
Level 1 lost power, and TCC seemed stuck in the "Initialize all Framework
Registers". I looked in the logfile from when this happened. TCC was simply
going along its initialization, but every write was waiting for the CBUS and
timing out (but after about 2 s apiece), and then reading back 0. This is
the same symptoms as were noticed on 3-jan.

    Again, it seems that the ZRL pQBA got screwed up, but not enough to
raise its internal error flag so that TCC could know it.

Eta +/- 17..20 CHTCR PROMs were tested.
--------------------------------------
The last of the CHTCR PROMs were successfully tested, see previous entry for
test method.

Random Tests
------------
Tests now fail after some number of loops on the Global EM Tower count for
Ref #0. The count becomes short by 2, then catches up. In order to make
sure that the  test catches the problem right away, limit it to all
eta/phi, page 4 only, and EM Ref #0 only. The count is always changing
between Bad-Good when the test plays with a tower at eta +13..16, phi 1..8.

further diagnosis: let the Random test catch an error. Use Tree.Browser to
locate the culprit. The input to Tier#2 from this CHTCR reads 17 while the
readout of the CHTCR inputs shows that 19 bits are set.

The CHTCR PROM test now also produces lots of errors, while testing Tier #1
PROM #1, #2, or #3, and the Tier #2 PROM. Note that this CHTCR passed the test
on 21-DEC-1993. It is sometimes (I don't believe always) missing the second bit
(i.e. short by 2 counts). It is not obviously correlated to the first or third
bit. When it acts up, it is not an intermittent problem.
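
The 17-versus-19 discrepancy can be decomposed into bit values with a short
sketch, which shows why the fault points at the second bit.  The function name
is illustrative and is not part of the test code or of Tree.Browser.

```python
def missing_bit_values(expected_count, observed_count):
    """Return the bit values (1, 2, 4, ...) that are set in the expected
    count but missing from the observed count."""
    diff = expected_count & ~observed_count
    return [1 << bit for bit in range(diff.bit_length())
            if (diff >> bit) & 1]

chtcr_inputs_set = 19   # bits set according to the CHTCR input readout
tier2_input = 17        # count seen at the Tier#2 input from this CHTCR

print(missing_bit_values(chtcr_inputs_set, tier2_input))
```

Only the bit of value 2 differs between 19 (binary 10011) and 17 (binary
10001), which matches a count that is short by 2.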

One should suspect a problem with the second tier PROM on this channel (in
its output circuitry), or pin/socket connection, or trace short, or water
damage, or cable/connector in the front of this card, or maybe instead a
problem with the input of the following Tier#2 CAT2.

One could use the random test to bring the card to a point of failure, unplug
the front cable and check the differential voltages for the second bit.

I have the strong impression that this problem is now appearing more often (i.e.
on more combinations of bits?) than it was on 20..21-DEC...??!!
..............................................................................

Date:   5,6,7-JAN-1994   At: Fermi   Topics: Install the last 6 of the Rev
                                     1993 Term-Attn,  Investigate the readout
                    problem at +13,1,  Repair problems that Dan Owen found in
                    the pulser runs,  Reworked the AC power for the new TCC and
                    its BA23 box, New procedures for booting the L1 68k and for
                    booting TCC,  Look at Temperatures,  What is connected to
                    eta 20 EM.

Installed the last 6 of the new 1993 Term-Attn networks.  They were:
  -7,1    -8,1    +12,1    +12,2    +12,3    +12,4

Investigated the problem with +13,1 EM and HD reading out funny values
(typically 247) but participating OK in the generation of triggers.  The
problem was that there were  NO  terminators (either CBus or T&S Bus) on the
cable from M114 to the phi 1:16 cardfiles in racks M107, M108, M109, M110.
This string of racks is fed from the M110 end so the CTMBD in M107 needed to
have Terminators.  I plugged in the 110 ohm packs and the +13,1 readout now
looks OK.  What other troubles could this have caused?  NO terminator on the
Timing and Sync Bus !!   This is very likely why we replaced this BBB a couple
of weeks ago.   That BBB (SN# 9) is very likely all ok.

Repair some Trigger Tower problems:
 -5,28  EM  was reading 25%.  It had a bad Term-Attn.
-16,9   HD  was reading 60%.  It had a bad Term-Attn.
+15,12  EM  was reading 60%.  CTFE SN# 277 had wrong value at R107  (1k vs 3k).
+20,12  HD  was reading 60%.  CTFE SN# 357 had wrong value at C303  (220-15pf).
-17,18  EM  was reading 60%.  CTFE SN# 207 had wrong value at C295  (220-15pf).
-20,22  HD  was reading 50%.  Broken connector at L1 end of BLS cable replaced.

AC power for the New TCC and its BA23 now comes from the strip line inside rack
M114.  This strip line is powered from the unswitched 115 AC outlet on the
Contactor Box for M114.  Thus the new TCC has the same source of AC power and
safety gnd as the rest of the equipment in M114.  This is how the old TCC got
its AC power.

I finished writing new procedures for booting the L1 68k (VTC) and for booting
the new TCC.  These were posted on the West end of M113 and in the DAQ expert
notebooks.  These new procedures are in the files  TrgMisc:Start_68020.txt
and  TrgMisc:Start_TCC.txt.   People should check and correct the Start_TCC
file.  I have put labels on the new TCC box (near its On-Off switch) and on
TCC's  BA23 box.   The file [D0_Text.Software.Trics_Doc]Boot_Procedure.txt
needs to be looked at and modified for the new TCC (e.g. pushing reset, typing
B, where TCC is located).

    5-JAN-1994  Everything ON           7-JAN-1994  Everything ON
    -------------------------           -------------------------
    air flow  300 to 310 lfpm           air flow  300 lfpm
    water temp  54.7                    water temp  54.3
    68.5     67.6    109.8              68.0     67.1    109.6
    68.4    110.4    110.4              67.9    110.0    110.1

What is connected to eta 20 EM ???   People are looking at BLS resistors but
it looks like the highest eta HD elements (i.e. cal eta 4.4) are connected to
our 20 EM.  I had asked for this highest HD stuff to be connected to 20 HD.
If it is connected to 20 EM  then what trouble can it cause??  What is in the
20 EM PROMs?  Dan Owen let the trigger meisters know not to let the EM Ref
Sets go out to include 20.
..............................................................................

Date:  3-JAN-1994      At: MSU    Topics: Trouble at Fermi  ReBoot D0HTCC
                                          (reboot via power cycle)

        From:   D0::TRGUSER       3-JAN-1994 10:33:42.40
        Subj:   TRICS V5.2/02-JAN-1994/  Exit Refresh Monit Pool

        From:  D0::TRGUSER       3-JAN-1994 13:11:59.28
        Subj:  TRICS V5.2/03-JAN-1994/  Booting

At about 11:45 AM (Chicago time) the Control Room called claiming that
the TCC could not talk to the Framework or the Cal Trig.  According to
the TRICS Log the problem began at about 10:30 AM (Chicago time).  Before
10:30 there were no errors, after 10:30 every attempt to communicate with
the COMINT appeared to fail.  The last mail message was an "Exit Refresh
Monit Pool" at 10:33.

Jan was in the Control Room, so they called us before trying anything
drastic (i.e. they didn't reboot or turn any power off, or try EDEBUG).
Sal (Fahey?) was DAQ Expert and he thought that the problem was correlated
with several L0 high-voltage supplies tripping off.  Recall that either the
BA23 or the uVAX 4000 box gets its 110V from the same strip which services
some of the L0 high-voltage supplies.

Rack M114 had power (low-voltage monitoring indicated correct voltages, and
the LEDs and orange AC indicator were on), also the uVAX 4000 and the BA23
box had their AC indicator lights on.  I could talk to the uVAX via TRICS,
but all CBUS WRITEs failed (all FAs read back 0).

I had Sal power-cycle the uVAX 4000 and the BA23 box.  This appeared to
solve the problem.  The "Booting" mail message was sent at 13:11 (Chicago
time).

I wish I had had Sal try to TRIGGER the node.  I didn't try that but I
believe that the node would have TRIGGERed OK.  I don't know whether
TRIGGERing the node would have solved the problem, though.  If this happens
again we should try TRIGGERing the node before power-cycling.

Sal claimed that there were instructions for booting the TCC (but he didn't
say where they were, i.e. were they taped to the rack or were they in
some log book?) which were incorrect.  He asked where the RESTART button
on the uVAX 4000 was.  I did not tell him because the RESTART button is
not clearly marked and it is mixed in with some DIP switches, etc. which
should not be touched.

Added by Philippe 12-JAN-1994:
    - Inspect the logfile TRICS_02JAN94.LOG;1 which includes the pQBA problem.
      No pQBA device error was logged at the initial problem time.
      But the end of the logfile (12:51) shows what probably was a manual
      power-cycling of the BA23.
      The logfile shows that the pQBA device was then woken up and that
      TRICS reset it.
      There is also a successful write afterwards, before TCC was rebooted.

      Whatever happened was significant enough to screw up the pQBA, but not
      enough to wake up its power problem error flag that is checked by the ZRL
      Interrupt Service Routine. The actual power cycling did, and TRICS had a
      chance to reinitialize the pQBA and the DRV11J(s)

..............................................................................

Date:  2-JAN-1994      At: MSU    Topics: Trouble at Fermi  Required a Power
                                          Cycle ReBoot D0HTCC to fix it.

They called at about 4:30 AM.  I ended up having them climb up on the ladder
and power cycle boot TCC.  TrgMon could not connect to TCC, COOR could not
connect to TCC.  I'm 99% certain that EDEBug could connect to TCC although I
did not do this myself.  Triggering TCC from NCP did nothing  i.e. NCP
triggering did NOT cause TCC to boot.   I only had them power cycle the 4k
box.  The BA23 was not power cycled.

The last mail message from TCC was an Init at 4:09.  The power cycle boot
was at about 5:10 Sunday morning Jan 2nd.

Mary Ann Cummings was the DAQ expert on shift.  I'm not sure what happened
but there is some story like this:

   The L2 Graphics program was running at the same time that she switched
   the L2 nodes over to collider mode.   Booting the L2 nodes while this
   new L2 Graphics program is running is known to cause EtherNet problems.

I expect that we should talk to Jan to understand the details of this L2
Graphics vs change over problem.   Perhaps if there is time this week then
we could crash the system in this way on purpose so that Philippe can see
what happens inside TCC.

As listed in the entry in this log book from Dec 20th we need to make new
written instructions about how to boot the TCC.  Actually I do not think
that we ever have had written instructions about TCC booting ?   New
instructions could include: NCP boot vs power cycle boot, TCC boots from
its disk even though one normally causes it to boot via a network command.
..............................................................................
               Log book for 1993 is in D0_HALL_LOGBOOK.LBK_1993
..............................................................................