This is the log book for the TRICS software
                              started 22-JAN-1992 

================================================================================
  +--------------------------------------------------- Updated - 7-JUL-1994 -+
  |Code development Policy:                                                  |
  |- DZERO::EWORK1 has current working code.                                 |
  |- DZERO::ETRICS has last stable code. Backup EWORK1 before modifying.     |
  |- MSU::ETRICS is a backup of DZERO::ETRICS:                               |
  |- MSU::EWORK1: is a backup of DZERO::EWORK1:                              |
  +--------------------------------------------------------------------------+
  |Archival Policy:                                                          |
  |- subdirectory [.TCC] has the files needed on TCC                         |
  |- subdirectory [.TRGUSER] has all important files from the trguser account|
  |- subdirectory [.TRG_LIB] has all important files for linking purpose     |
  +--------------------------------------------------------------------------+
  |Code Update reminder:                                                     |
  |- Check Version number and comments in SITE_DEPENDENT.CST                 |
  |- SITE_DEPENDENT.CST different parameters  -- cf *.CST_MSU and *.CST_DZERO| 
  |- TRICS_Vnm.DAT different node name+number -- cf *.DAT_MSU and *.DAT_DZERO|
  +--------------------------------------------------------------------------+
================================================================================

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  19-JUN-1996 Philippe:    MSU::                 load system from 24-MAY 
                                                 find problem, make new system

    - kludge version of MOD103_HANDLE_SCALERS.PAS
      Make version that does not toggle the scan/reset line on the 36x36
      scan/reset MTG
    - Modify MOD227_PHAT_EXECUTE.PAS
      change argument checking restriction that was limiting MOD_HDB messages to
      CBUS <=2.  Now set to CBUS<= 3.

    - make and load new system EWORK1:TRICS_V64.SYS_19JUN96
      The old ("production") system is still in EWORK1:TRICS_V64.SYS_24OCT95

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  24-MAY-1996 Philippe:    MSU::                kludge system to run with 
                                                L1.5 CT and 36x36 scalers off 

    - kludge version of MOD100_HANDLE_L15CT.PAS
                    and MOD100_HANDLE_L15CT.PAS
        that has nearly empty routines for handling the L1.5CT VME IO

    - new system is EWORK1:TRICS_V64.SYS_24MAY96

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  26-FEB-1996 Philippe:    MSU::                 clean up disk directory 

    - Clean D0HTCC::[TRIGGER] directory.
        - copy TRICS*JAN95.LOG and TRICS*FEB95.LOG to MSUD02$DUA1:[TCC_LOG_IC]
        - also copy MPOOL*.LOG, LOG*.LOG, MAIL*.LOG from FEB96
        - delete TRICS* logfiles from Dec, jan, feb
        - Delete MPOOL*/LOG*/MAIL*.LOG from Dec, Jan, Feb
        - [TRIGGER] now uses 59k blocks
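
    The monthly cleanup above can be planned before any file is touched.  The
    following is only a sketch, assuming logfile names follow the pattern seen
    elsewhere in this log book (e.g. TRICS_30OCT94.LOG, LOG_SERVER_09JUN95.LOG).
    The helper names group_by_month and plan_cleanup are hypothetical and are
    not part of TRICS.

```python
import re
from collections import defaultdict

# Assumed name pattern: <PREFIX>_<DD><MON><YY>.LOG, optionally followed by a
# VMS version number such as ";2".
LOGFILE_RE = re.compile(
    r"^(?P<prefix>[A-Z_]+)_(?P<day>\d{2})(?P<mon>[A-Z]{3})(?P<yy>\d{2})\.LOG$"
)


def group_by_month(names):
    """Return {(prefix, 'MONYY'): [names]} for names matching the pattern."""
    groups = defaultdict(list)
    for name in names:
        base = name.split(";")[0].upper()  # drop the version number, if any
        m = LOGFILE_RE.match(base)
        if m:
            groups[(m["prefix"], m["mon"] + m["yy"])].append(name)
    return dict(groups)


def plan_cleanup(names, copy_months, delete_months):
    """Return (to_copy, to_delete): TRICS logfiles to back up, files to delete."""
    to_copy, to_delete = [], []
    for (prefix, month), files in sorted(group_by_month(names).items()):
        if prefix == "TRICS" and month in copy_months:
            to_copy.extend(files)
        if month in delete_months:
            to_delete.extend(files)
    return to_copy, to_delete
```

    For the entry above one would call plan_cleanup with copy_months set to
    {"JAN95", "FEB95"}, review both lists, and only then copy and delete.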

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  17-JAN-1996 Philippe:    MSU::                 clean up disk directory 

    - Clean D0HTCC::[TRIGGER] directory.
        - copy TRICS*DEC95.LOG to MSUD02$DUA1:[TCC_LOG_IC] ** use new DIR **
        - did NOT copy MPOOL*.LOG, LOG*.LOG, MAIL*.LOG 
        - delete TRICS* logfiles from Dec
        - leave MPOOL*/LOG*/MAIL*.LOG from Dec for now (will kill next month)
        - Delete MPOOL*/LOG*/MAIL*.LOG from Nov (left from last month)
        - [TRIGGER] now uses 57k blocks

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  16-NOV-1995 Philippe:    MSU::                 clean up disk directory 

    - Clean D0HTCC::[TRIGGER] directory.
        - copy TRICS*OCT95.LOG to MSUD02$DUA1:[TCC_LOG_IC] ** note new DIR **
        - copy TRICS*NOV95.LOG to MSUD02$DUA1:[TCC_LOG_IC]
        - did NOT copy MPOOL*.LOG, LOG*.LOG, MAIL*.LOG 
        - delete TRICS* logfiles from Oct
        - leave MPOOL*/LOG*/MAIL*.LOG from Nov for now (will kill next month)
        - [TRIGGER] now uses 41k blocks

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  7-DEC-1995      Philippe:                 TCC needs reboot

    - Power supply failure in upper tier 1 supply in rack M105. (probably) Right
      after the system was back on, all TRGMON sessions were dumped, and could
      not be restarted.  COORs requests to TCC were not being serviced.

    Here is what could be reconstructed by looking at logfiles:

  11:52  COOR Initialized TCC with Caltrig not turned on
            this must be to switch to the special run

  11:54  Initialization completes with a bunch of errors since most of the
            hardware is turned off, but no problem here

    ...  Everything is fine, COOR sends requests and TCC can perform IOs to
         the hardware all ok.

  13:59  "SOMETHING" happens preventing all subsequent IOs from TCC to L1FW+CT

  14:02  the Monitoring server, more exactly ITC, dumps all TRGMON connections
         and quits accepting new ones.
         The error in TCC's Mpool_server was
            ITC-E-NO_CHANNEL, Channel requested has not been activated

         COOR keeps sending requests to unwind from the special run. TCC is
         bogged down by slow and failed IOs as it tries to get control of the
         L1FW+CT bus (remember this bus is shared between TCC control and high
         speed event readout).  Each individual IO has to time out (7s): this
         takes forever. 15 min later TCC hadn't even got to the INITIALIZE
         request from COOR.

    Now for what the "something" was: we believe the ZRL pQBA interface in the
    BA23 QBUS enclosure tied to the microVAX 4000/60 TCC must have gotten
    corrupted. Flipping on power supplies in L1 CT might have been correlated
    to this. The pQBA stayed hosed until it could be reset.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  4-DEC-1995                                First weekly reboot of TCC

    - TCC is now rebooted every Monday.  During "solid store operation" this
      should happen before flipping magnet polarity, as the latter is done after
      the protons are already in.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  16-NOV-1995 Philippe:    D0::                 clean up disk directory 

    - clean [TRIGGER] directory.  Delete (not saved) files from oct.
      [trigger] now holds 42,480 blocks

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  15-NOV-1995 Philippe:    MSU::                power failure, bug noticed

    - After a sitewide power failure, TCC was booted before D0HSC was available.
      This caused an access violation in the booting sequence that hung TCC.  
      This happened when TCC was trying to write the begin/end run file to
      capture the scaler counts before initializing them; the file could not be
      written to the host, and the error message that was trying to advertise
      the fact tried to access OPTIONAL arguments not passed from this
      initialization call.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  30-OCT-1995 Philippe:    D0::                 clean up disk directory 

    - clean [TRIGGER] directory.  Delete (not saved) files from july, and sept.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  23-24-OCT-1995 Philippe:    D0::              add 36x36 bunch test scalers

   - New system in EWORK1:
      TRICS_V64.SYS_23OCT95 add support for the 36x36 bunch test scaler crate
                            and control line length of the show sptrg message

   - TRICS_V64.SYS_24OCT95;1 Fix bug introduced preventing SBSC loading

   - TRICS_V64.SYS_24OCT95;2 Fix bug preventing finding 
                             all mtg36x36_ctrl registers

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  20-JUN-1995 Philippe:    MSU::                timing of COOR message execution

    - numbers in seconds, population sample on the righthand column.
      note that the first "SPECTRIG L15_TYPE" takes longer, as the special
      command file TRICS_L1_OBEY_L15.DAT needs to be executed

  SPECTRIG   ENABLE     %%    min=  0.10   max=  0.10   ave=  0.10      n = 1
  SPECTRIG  FEBZDIS     %%    min=  0.09   max=  0.17   ave=  0.09      n = 30
  SPECTRIG  RD_TIME     %%    min=  0.08   max=  0.12   ave=  0.09      n = 30
  SPECTRIG ANDORREQ     %%    min=  0.12   max=  0.22   ave=  0.12      n = 30
  SPECTRIG L15_TERM     %%    min=  0.09   max=  0.17   ave=  0.10      n = 12
  SPECTRIG L15_TYPE     %%    min=  0.08   max=  1.43   ave=  0.17      n = 15
  SPECTRIG OBEYBUSY     %%    min=  0.08   max=  0.11   ave=  0.09      n = 30
  SPECTRIG OBEYLEV2     %%    min=  0.08   max=  0.17   ave=  0.09      n = 30
  SPECTRIG PRESCALE     %%    min=  0.08   max=  0.18   ave=  0.09      n = 58
  SPECTRIG STARTDGT     %%    min=  0.09   max=  0.10   ave=  0.09      n = 30
    REFSET     EMET     %%    min=  0.10   max=  0.26   ave=  0.17      n = 25
    REFSET LRG_TILE     %%    min=  0.10   max=  0.12   ave=  0.10      n = 14
  THRESHLD  EMETCNT     %%    min=  0.09   max=  0.09   ave=  0.09      n = 5
  THRESHLD MISPTSUM     %%    min=  1.26   max=  1.36   ave=  1.31      n = 3
  THRESHLD TOTETCNT     %%    min=  0.09   max=  0.09   ave=  0.09      n = 8
  ST_VS_RS TOT_LIST     %%    min=  0.09   max=  0.17   ave=  0.09      n = 16

     PAUSE              %%    min=  0.15   max=  0.27   ave=  0.21      n = 5
    RESUME              %%    min=  0.15   max=  0.23   ave=  0.17      n = 4

  L15CTERM   REFSET     %%    min=  0.09   max=  0.12   ave=  0.10      n = 9
  L15CTERM  LOC_DSP     %%    min=  0.09   max=  0.09   ave=  0.09      n = 3
  L15CTERM FRAMECOD     %%    min=  0.08   max=  0.16   ave=  0.11      n = 3
  L15CTERM GLOB_DSP     %%    min=  0.09   max=  0.09   ave=  0.09      n = 3
  L15CTERM ST_VS_TM     %%    min=  0.09   max=  0.09   ave=  0.09      n = 3
  L15CTSYS    START     %%    min= 11.27   max= 11.58   ave= 11.31      n = 80
  L15CTSYS LOADCODE     %%    min= 13.08   max= 13.68   ave= 13.33      n = 79

  WRT_HOST  END_RUN     %%    min=  0.19   max=  0.19   ave=  0.19      n = 2
  WRT_HOST BEG_STOR     %%    min=  0.18   max=  0.18   ave=  0.18      n = 1
  WRT_HOST  SYNCHRO     %%    min=  0.08   max= 11.36   ave=  --

   INITIAL              %%    min= 50.35   max= 51.59   ave= 50.66      n = 99
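
    The timing table above has a regular layout, so it can be parsed for
    further analysis.  This is a minimal sketch, assuming each row looks like
    the rows above (command, optional sub-command, "%%", then min/max/ave and
    an optional sample count n).  The names parse_timing and slowest are
    hypothetical.

```python
import re

# One row of the timing table: "CMD [SUB] %% min= x max= y ave= z [n = k]".
# The average may be "--" and the sample count may be missing.
LINE_RE = re.compile(
    r"^\s*(?P<cmd>\S+)(?:\s+(?P<sub>[^\s%]+))?\s+%%"
    r"\s+min=\s*(?P<min>[\d.]+)\s+max=\s*(?P<max>[\d.]+)"
    r"\s+ave=\s*(?P<ave>[\d.]+|--)(?:\s+n\s*=\s*(?P<n>\d+))?\s*$"
)


def parse_timing(line):
    """Return a dict for one table row, or None if the line is not a row."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    name = m["cmd"] if m["sub"] is None else m["cmd"] + " " + m["sub"]
    return {
        "command": name,
        "min": float(m["min"]),
        "max": float(m["max"]),
        "ave": None if m["ave"] == "--" else float(m["ave"]),
        "n": None if m["n"] is None else int(m["n"]),
    }


def slowest(rows, k=3):
    """Return the k parsed rows with the largest maximum execution time."""
    valid = [r for r in rows if r is not None]
    return sorted(valid, key=lambda r: r["max"], reverse=True)[:k]
```

    Feeding every line of the table through parse_timing and then calling
    slowest gives a quick list of the COOR messages that dominate the time
    budget.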

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  15-JUN-1995 

    - Jan caught a problem with TCC:

        We were in the middle of a store.  We had just done an end run, change
    prescales and begin run.  Then about 10 minutes (or so) into the new run
    I noticed that all the L2 nodes were in wait data.  We were not getting 
    any events into L2.  I looked at TRGMON and saw a single line message in
    the middle of the screen.  I forget exactly what it said, but it was
    telling us that TRGMON could not talk to TCC.  I then tried edebugging
    TCC and it looked ok.  I did a directory of TCC's disk, and that
    responded.  Dean and I then went in to look at TCC's console.  We didn't
    see anything unusual.  It just looked like it froze.  When I edited TCC's
    log file, the last line was that it was closing the log file.  I then
    edebugged TCC again and got some output for you, shown below.  TRGMON
    still couldn't talk to TCC.  So at this point we triggered TCC.  We have
    now redownloaded the triggers and are running happily.

    Answer:

  After examining the logfiles from TCC, I would believe that the problem was 
related to screen access.  All of TCC's jobs and subprocesses have to 
synchronize to avoid writing at the same time.  There is a semaphore to handle 
this, and it has been performing reliably as far as I can tell.   It hasn't 
always been simple, and I had to address some difficulties -- e.g. when 
interrupt routines want to make screen IO, or for special time critical 
sequences -- but I believe everything has been ironed out by now.  There is 
also an emergency recovery that can automatically de-jam the semaphore: it was 
tried last night but seemed to not have been enough.

  There are two possible explanations for last night's lock-up.  Either there 
still is some possible but rare sequence of events that I don't understand and 
that can get us in trouble once every 6 months (though I couldn't see any trace
of improper activity at the time), or the screen was physically locked up, 
either by someone pushing the hold screen key or by hardware failure. 

extracted from logfiles:

%% time: 09-JUN-1995 13:42:02.02
 TRICS V6.4   CLOSED LOGFILE, DUA0:[TRIGGER]LOG_SERVER_09JUN95.LOG                                  
%% time: 15-JUN-1995 18:15:28.58
E-EXC/MBX% Message Mailbox is Full but Not Signaled                    
S-EXC/MBX% Flush_to_File now Servicing Exception Mailbox               
X-WAI/CNS%flush% Console Locked for 5s, Recover: Force Unlock          
S-EXC/MBX% Exception Mailbox now empty                                 
%% time: 15-JUN-1995 18:15:28.91
 TRICS V6.4   CLOSED LOGFILE, DUA0:[TRIGGER]LOG_SERVER_09JUN95.LOG     

C-RCV/CH1%   1:17  %00001034   RESUME                                       
S-PRS/CHK% COOR Lets Framework Resume                                       
C-ACK/CH1%   1:44  %00002201   ACKNOW 00001034       OK     DONE            
S-MPL/FRH% Start Getting Fresh Data Blocks @ 15-JUN-1995 18:04:55.15        
%% time: 15-JUN-1995 18:06:09.71
 TRICS V6.4   CLOSED LOGFILE, DUA0:[TRIGGER]TRICS_09JUN95.LOG               

%% time: 15-JUN-1995 05:38:11.01
I-MAI/SRV% Mailed to TRGMGR: TRICS V6.4/09-JUN-1995/  COOR Initializing Trigger
%% time: 15-JUN-1995 05:38:44.21
 TRICS V6.4   CLOSED LOGFILE, DUA0:[TRIGGER]MAIL_SERVER_09JUN95.LOG   

%% time: 15-JUN-1995 18:12:15.22
S-MON/SRV% Channel #4 Disconnected after generating 6 messages   
%% time: 15-JUN-1995 18:15:30.59
 TRICS V6.4   CLOSED LOGFILE, DUA0:[TRIGGER]MPOOL_SERVER_09JUN95.LOG 


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   9-JUN-1995 

    - TCC was rebooted because it had quit writing to its TRICS logfile. 
      It had been up for almost 10 days, and leaking memory anyway.
      All other logfiles seemed fine and active.  Remote file access was happy
      too. It seems like one IO to the logfile failed, and TRICS switched over
      to its protective mode of no longer writing to the logfile.

      As far as we know, this is the first time this happened.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   5-JUN-1995 

    - Clean D0HTCC::[TRIGGER] directory.
        - copy TRICS*APR95.LOG to MSUD02$DUA1:[TCC_LOG_IB]
        - did NOT copy MPOOL*.LOG, LOG*.LOG, MAIL*.LOG 
        - delete TRICS logfiles from April
        - [TRIGGER] now uses 84,825 blocks

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  31-MAY-1995

    - Jan rebooted TCC this afternoon.  The store was just being scraped, and
      there was an "apparent problem" between COOR and TCC.   As I understood
      it, they sent an initialize command and never got notified that it
      completed.   Moreover, she said the TRGMON display did NOT show that the
      time since initialize was reset. Jan didn't take any chances and rebooted
      TCC.  I do not dispute that. 

      But I still wanted to find out what really happened.  Jan saw the DECNET
      links still alive, the COOR logfiles seemed to just show that the timeout
      value was changed (yes, it seems that Bruce just changes the timeout back
      and forth) and then nothing else.

      I looked in TCC's logfile.  It did get the INITIALize message on time, and
      acknowledged 50 sec later, as usual.  There was absolutely no sign of
      problem. So maybe Jan was too quick to give up, or there might be
      something sick in COOR (some new bug introduced recently?).  

    - It became clear that there was absolutely nothing wrong.  People on shift
      (including Jan) are used to getting the feedback from COOR saying "TCC
      acknowledge timeout".  They no longer get the message.  It seems that they
      don't get any acknowledgement in COOR's logfile (but I don't think they
      get an acknowledgement for ANY message, only when bad), just the change of
      timeout message.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  29-MAY-1995

    - TCC disk problem again. Not during store, MR was in access.

    - The SCSI port driver PKCDRIVER starts forgetting (after running fine for 
      several days) to delete messages that pile up and consume "Pool Blocks"
      (one per message lost), and memory.  This time I could see that PKCDRIVER
      had not run out of pool blocks, nor exhausted all the memory.   I need to
      spend more time on this.  After a 30 sec  look, I still don't know for
      sure which quota it bumped against. 

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  22-APR-1995 and 24-APR-1995 

    - Jan boots TCC along with the rest of the ELN mob to stay clear from name
      server problems.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  17-APR-1995 Philippe:  
  
    - TCC had another spell of ELN/Disk problems last night; DAP code = 01F77C54
      2.5 hours of low lum beam were lost.  
      The DAQEXP waited 2 hours before calling for help. 
      The last entry in TRICS logfile is from 16-APR-1995 17:05
      Mail messages to TRGMGR show last COOR INIT at 16-APR-1995 03:52

    - a mail message was sent to Jan and Stu Fuess (CC D.Owen)

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   6-APR-1995 Philippe:  

    - Clean D0HTCC::[TRIGGER] directory.
        - copy TRICS*MAR95.LOG to MSUD01::DUA1:[BACKUP.TCC_LOG_Ib]
        - did NOT copy MPOOL*.LOG, LOG*.LOG, MAIL*.LOG 
        - delete TRICS logfiles from March
        - [TRIGGER] now uses 21,600 blocks

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  31-MAR-1995 Philippe:  

    - Steve notices that TCC is in trouble again.
      This is identical to 30-DEC-1994, 25-MAY-1994, and 12-JUL-1994.
      Trying to do a directory on TCC produces:
      -RMS-F-NET, network operation failed at remote node; DAP code = 01F77C54

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  30-MAR-1995 Philippe:  

    - Clean D0HTCC::[TRIGGER] directory.
        - copy TRICS*JAN95.LOG and *FEB95 to MSUD01::DUA1:[BACKUP.TCC_LOG_Ib]
        - did NOT copy MPOOL*.LOG, LOG*.LOG, MAIL*.LOG 
        - delete TRICS logfiles from January and February
        - [TRIGGER] now uses 72,800 blocks

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   1-2-MAR-1995 Philippe:  

    - move V6.3 from EWORK1: -> ETRICS: 

    - Archive TRICS V6.3 to MSUD01::DUA1:[ARCHIVE.TRICS_V63]
        done from MSUD01:: using 000_ARCHIVE.V63

    - EWORK1: update version number V6.3 -> V6.4

    - Copy new files from MSU to DZERO
        MPOOL_SERVER.PAS 
        MPOOL_DATA.TYP 
        MOD223_COOR_GLOBAL_EXECUTE.PAS 
        SITE_DEPENDENT.CST_MSU/DZERO
        MOD067_HANDLE_ZRL.PAS
        MOD245_PHAT_DISPATCH.PAS 
        MOD227_PHAT_EXECUTE.PAS   

    - build New System EWORK1:TRICS_V64.SYS_2MAR95, load it
    - old one is in ETRICS:TRICS_V63.SYS_19OCT94
        - New Monit Pool Server for TRGMON with longer integration time.
            (only one mpool_server now)
        - Fix message length for mail message "BAD returned to COOR".
            (this problem was crashing the mail server)
        - Add new message to change the threshold for the error message filter.
            This would be useful to see ALL error messages during initialize.
            (boot default value is 50)
            $ @EENV:COMMANDS
            $ PHAT ERR_FILT nnn
            with  1 <= nnn <= 9,999,999

    - Update TRGUSERROOT:[TRGMON]
        release of TRGMON V6.1 with option for setting longer integration time
        - old files are renamed from x.y to x.y_OLD and will be saved for awhile
        - "improve" the menus for setting integr. time and polling/refresh time 
            (these are now 2 separate commands)
        - Add options during Print/Dump Screen to close the file or change name 
        - add <Form_Feed> between screen dumps.
        - sense and use full screen length during program startup.
        - Only one version of TRGMON left (no more TEST version)

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   7-FEB-1995 Philippe:  

    - Dan had power off for a long time. The ZRL hardware does not hang this
      time. TCC is now resetting the ZRL interfaces during initialize. This is
      not proof that all problems are solved. Also read the pQBA/pVBA
      registers before and after the Initialize: no difference.

        pQBA Register Base Address                              = %X34000000
        pQBA Q-Bus Error Address Reg  - Longwrd (QBus Address)  = %X00000000  
        pQBA Q-Bus Interrupt Reg INT0 - Word #1 (QBus Int Vect) = %X0020
        pQBA Q-Bus Interrupt Reg INT0 - Byte #3 (Qbus Int Enb)  = %B00000000
        pQBA Q-Bus Interrupt Reg INT0 - Byte #4 (QBus Int Pend) = %B00000000
        pQBA Q-Bus Interrupt Reg INT1 - Byte #1 (QBus Err Msk)  = %B00000000
        pQBA Q-Bus Interrupt Reg INT1 - Byte #2 (Reset Error)   = %B00000000
        pQBA Q-Bus Interrupt Reg INT1 - Byte #3 (QBus Err Enb)  = %B10000000
        pQBA Q-Bus Interrupt Reg INT1 - Byte #4 (QBus Ext Err)  = %B00000000
        pQBA Reset Register           - LSBit   (1 = Problem)   = %B0
        
        
        pVBA Register Base Address                              = %X35000000
        pVBA Bus Control Register     - Longwrd (Bus Ownership) = %XFF7F7F7F
        pVBA Error Status Register    - Byte #1 (Error Mask)    = %B11111111
        pVBA Mailbox FIFO Register    - Byte #1 (Register Numb) = %B00000000
        pVBA Mailbox FIFO Register    - MSBit   (1=FIFO empty)  = %B1
        pVBA VME Interrupt Reg INT0   - Byte #1 (Vector Number) = %X00
        pVBA VME Interrupt Reg INT0   - Byte #3 (IRQ enab Mask) = %B11111111
        pVBA VME Interrupt Reg INT0   - MSBit   (0 = Int Pend)  = %B1
        pVBA VSB Interrupt Reg INT1   - Byte #1 (Vector Number) = %X00
        pVBA VSB Interrupt Reg INT1   - Byte #3 (Int enab Mask) = %B00000000
        pVBA VSB Interrupt Reg INT1   - MSBit   (0 = Int Pend)  = %B1
        pVBA Reset Register           - LSBit   (1 = Problem)   = %B0

    Note that the byte #3 of pQBA INT0 should read %X80 instead of %X00. 
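
    Comparing register dumps taken before and after an Initialize is easier
    when the %X (hex) and %B (binary) literals are converted to integers.
    The following is only a sketch, assuming dump lines have the form
    "name = %Xvalue" as in the listing above.  The names parse_radix,
    parse_dump and diff_dumps are hypothetical.

```python
# Radix prefixes as used in the dump above (%X, %B); %O and %D are included
# on the assumption that the same notation covers octal and decimal.
BASES = {"X": 16, "B": 2, "O": 8, "D": 10}


def parse_radix(literal):
    """Convert a literal such as '%X0020' or '%B10000000' to an int."""
    text = literal.strip().upper()
    if len(text) < 3 or text[0] != "%" or text[1] not in BASES:
        raise ValueError("not a radix literal: %r" % literal)
    return int(text[2:], BASES[text[1]])


def parse_dump(text):
    """Return {register description: integer value} for a register dump."""
    registers = {}
    for line in text.splitlines():
        # Split on the LAST " = " because descriptions may contain one,
        # e.g. "(1 = Problem)".
        left, sep, right = line.rpartition(" = ")
        if not sep or not right.strip().startswith("%"):
            continue
        registers[" ".join(left.split())] = parse_radix(right)
    return registers


def diff_dumps(before, after):
    """Return {name: (old, new)} for registers whose value changed."""
    names = set(before) | set(after)
    return {
        n: (before.get(n), after.get(n))
        for n in names
        if before.get(n) != after.get(n)
    }
```

    An empty result from diff_dumps corresponds to the "no difference"
    observation above; a check such as parse_radix("%B10000000") == 0x80 can
    be used to test a single byte against its expected value.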
      

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  27-JAN-1995 Philippe:  

    - status of backup of TCC logfiles 
        all of run Ib logfiles are in MSUD01::DUA1:[BACKUP.TCC_LOG_Ib]
        another copy is in            MSUD02::DUA1:[TCC_LOG_Ib]
        run Ia files are in           MSUD02::DUA1:[TCC_LOG_Ia]

    - There are 2 DAT tapes with a copy of all the files from run Ia and Ib from
      92, 93 and 94 (up to TRICS_30DEC94.LOG).
      (note that for some files from 1992, only the COOR messages were saved
      in files with suffix *.MSG)
      A listing of their content is in the file drawer labelled "TAPES" in the
      "lobby" of the E-shop 

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  26-JAN-1995 Philippe:  

    - Clean D0HTCC::[TRIGGER] directory.
        - copy TRICS*DEC94.LOG to MSUD01::DUA1:[BACKUP.TCC_LOG_Ib]
        - did NOT copy MPOOL*.LOG, LOG*.LOG, MAIL*.LOG 
        - delete TRICS logfiles from December
        - Purge all files (nothing to purge)
        - [TRIGGER] now uses 13,500 blocks

    - The December logfiles are also backed up to MSUD02::DUA0:[TCC_LOG_IB]
        (also backup the November logfiles)

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  12-JAN-1995 Philippe:  

    - Clean D0HTCC::[TRIGGER] directory.
        - copy TRICS*NOV94.LOG to MSUD01::DUA1:[BACKUP.TCC_LOG_Ib]
        - did NOT copy MPOOL*.LOG, LOG*.LOG, MAIL*.LOG 
        - delete TRICS logfiles from November
        - delete MPOOL*.LOG, LOG*.LOG, MAIL*.LOG from October
        - Purge all files (nothing to purge)
        - [TRIGGER] now uses 66,500 blocks

    - The NOV logfiles are NOT YET backed up to MSUD02::DUA0:[TCC_LOG_IB]
        (done 27-JAN-1995 )

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  30-DEC-1994 Philippe:  

    - Jan (just back from vacation) called.
      They were trying a new COOR (which was unrelated).  After the initialize,
      the system stays stale, with no sptrg #31, no lights on the L1FW.

      This is identical  to  25-MAY-1994,12-JUL-1994 and probably 13-JUN-1994 

    Trying to do a directory on TCC produces the error message:
%DIRECT-E-OPENIN, error opening D0HTCC::DUA0:[TRIGGER]TRICS_*.LOG; as input
-RMS-F-NET, network operation failed at remote node; DAP code = 01F77C54

      Edebug CTRL/C>halt 8,2    (FLUSH->FILE)
      Edebug CTRL/C>set ses 8,2
      Edebug 8,2>ex mod_handle_logfile\no_logfile
      Loading symbols for module "MOD_HANDLE_LOGFILE".
       NO_LOGFILE: TRUE

    This means that TRICS had a problem writing to its logfile at one time, and
    switched mode to fly with NO logfile. But TCC does not do a full switch
    over, as it still tries to get its input files (init_auxi, reset,...) from
    the disk. 
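
    The protective behaviour described above can be illustrated as follows.
    This is only a sketch of the idea, not the TRICS Pascal code: the class
    name ProtectiveLog is hypothetical, and only the flag name no_logfile
    mirrors the NO_LOGFILE variable examined with Edebug.

```python
class ProtectiveLog:
    """Write to a logfile until the first IO error, then stop trying."""

    def __init__(self, stream):
        self._stream = stream      # any object with write() and flush()
        self.no_logfile = False    # becomes True after the first failed IO

    def write(self, message):
        """Return True if the message reached the logfile."""
        if self.no_logfile:
            return False           # protective mode: do not touch the disk
        try:
            self._stream.write(message + "\n")
            self._stream.flush()
            return True
        except OSError:
            # One failed IO is enough to switch modes for good.
            self.no_logfile = True
            return False
```

    Note that in this sketch, as in the behaviour observed above, only the
    logging path switches over; anything else that reads from the same disk
    is unaffected by the flag and would still run into the failure.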

    There was no particular rush today, so I tried to see if I could find a
    possible "emergency recovery" action (in case this happens in the middle of
    a run, and we don't want to lose scaler information). So I changed the
    variable holding the location of TRICS's command files.
    
      Edebug 8,2>dep mod_common_global_flags\boot_directory_name
        = '57.3"TRGUSER TRGGER"::TRGCUR:'

    I then told TCC to initialize, and it did well with the L1FW (lights
    flashed, and sptrg #31 appeared) but the initialization got in trouble when
    it reached the L1.5CT and the load from_local_disk command. The
    initialization seemed to hang, but was just slow, as it had to time out each
    of the 12 EXE file OPENs.

    The disk loss probably occurred around 6 am (last entry in MPOOL_SERVER.LOG)
    TCC didn't try reaching its disk (after giving up on writing logfiles) and
    there were no problems or symptoms for COOR, as long as no request for
    initialization, or begin/end run file was sent.
 
    Reboot TCC and everything looks back to normal. There is an access now.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   2-DEC-1994 Philippe:  

    - Clean D0HTCC::[TRIGGER] directory.
        - copy TRICS*OCT94.LOG to MSUD01::DUA1:[BACKUP.TCC_LOG_Ib]
        - did NOT copy MPOOL*.LOG, LOG*.LOG, MAIL*.LOG 
        - delete TRICS logfiles from October
        - delete MPOOL*.LOG, LOG*.LOG, MAIL*.LOG from October
        - Purge all files
        - [TRIGGER] now uses 57,000 blocks

    - also backup all TRICS logfiles from MAY, JUN, JUL, AUG, SEP, OCT 
      to MSUD02::DUA0:[TCC_LOG_IB]

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  11-NOV-1994 Philippe:  
  
    - TCC was locked up this morning:

      TRGMON said: 0.00 Hz, Paused, Stale, with the expected TWB's (Paused,
      Stale, VME saw no L1 activity).
      DIR D0HTCC::[TRIGGER]*.* gave a normal response
      READLOG LAST 100 BLOCKS got as far as "asking TRICS to flush records to
      file" but no further.

These are the tail ends of the TRICS and LOG_SERVER logfiles:

S-15C/HDL% Preparing Params for L1.5 CT Crate                                                       %% time: 11-NOV-1994 09:23:24.53
S-15C/HDL% Copying Params to L1.5 CT Crate                                                          %% time: 11-NOV-1994 09:23:24.60
S-EXC/MBX% Flush_to_File now Servicing Exception Mailbox                                            %% time: 11-NOV-1994 09:23:31.11
X-DSP/EXC%2203468%PAS-F-FILALRACT, file already active                                              %% time: 11-NOV-1994 09:23:30.85
X-DSP/EXC%Skipping                                                                                  %% time: 11-NOV-1994 09:23:30.85
S-EXC/MBX% Exception Mailbox now empty                                                              %% time: 11-NOV-1994 09:23:31.48
 TRICS V6.3   CLOSED LOGFILE, DUA0:[TRIGGER]TRICS_30OCT94.LOG                                       %% time: 11-NOV-1994 09:23:31.48
C-RCV/CH2%   1:26  %00000001     PHAT CLOSELOG                                                      %% time: 11-NOV-1994 09:42:47.84
 TRICS V6.3   CLOSED LOGFILE, DUA0:[TRIGGER]TRICS_30OCT94.LOG                                       %% time: 11-NOV-1994 09:43:31.59
--------------END-OF-LOG-FILE------------------

I-LOG/SRV% Log Server Closed DUA0:[TRIGGER]TRICS_30OCT94.LOG;                                       %% time: 10-NOV-1994 22:40:09.93
 TRICS V6.3   CLOSED LOGFILE, DUA0:[TRIGGER]LOG_SERVER_30OCT94.LOG                                  %% time: 10-NOV-1994 22:43:24.70
E-EXC/MBX% Message Mailbox is Full but Not Signaled                                                 %% time: 11-NOV-1994 09:23:31.02
S-EXC/MBX% Flush_to_File now Servicing Exception Mailbox                                            %% time: 11-NOV-1994 09:23:31.29
X-WAI/CNS%flush% Console Locked for 5s, Recover: Force Unlock                                       %% time: 11-NOV-1994 09:23:30.85
S-EXC/MBX% Exception Mailbox now empty                                                              %% time: 11-NOV-1994 09:23:31.54
 TRICS V6.3   CLOSED LOGFILE, DUA0:[TRIGGER]LOG_SERVER_30OCT94.LOG                                  %% time: 11-NOV-1994 09:23:31.54
--------------END-OF-LOG-FILE------------------
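
The records in the extracts above are not in strict time order (for example
09:23:31.11 is logged before 09:23:30.85), so sorting by the "%% time:" stamp
helps when reconstructing a sequence of events.  This is only a sketch,
assuming each record carries its stamp on the same line, as in the two
extracts above.  The names parse_stamp and sort_records are hypothetical.

```python
import re
from datetime import datetime

MONTHS = {
    name: number
    for number, name in enumerate(
        "JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC".split(), start=1
    )
}

# Stamp format as it appears in the extracts: DD-MON-YYYY HH:MM:SS.cc
STAMP_RE = re.compile(
    r"%% time:\s*(\d{1,2})-([A-Z]{3})-(\d{4})\s+(\d{2}):(\d{2}):(\d{2})\.(\d{2})"
)


def parse_stamp(line):
    """Return the datetime of the '%% time:' stamp in a line, or None."""
    m = STAMP_RE.search(line)
    if m is None or m.group(2) not in MONTHS:
        return None
    day, mon, year, hh, mm, ss, cc = m.groups()
    # The stamp carries hundredths of a second; datetime wants microseconds.
    return datetime(
        int(year), MONTHS[mon], int(day), int(hh), int(mm), int(ss), int(cc) * 10_000
    )


def sort_records(lines):
    """Return the stamped lines in time order; unstamped lines are dropped."""
    stamped = [(parse_stamp(line), line) for line in lines]
    stamped = [pair for pair in stamped if pair[0] is not None]
    return [line for _, line in sorted(stamped, key=lambda pair: pair[0])]
```

Merging the TRICS and LOG_SERVER tails through sort_records puts the
"Console Locked for 5s" and "file already active" records side by side.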

The last message in the MAIL_SERVER logfile was for sending mail about this
last initialize, and closing the logfile.

The last message in the MPOOL_SERVER logfile was from 7:04, which is a bit
old, but not inconceivable.

There are supposed to be 2 MPOOL_SERVER logfiles (for the 2 servers running
in parallel). I just realized that there is a problem with this, because
they have the same name and just different revision numbers. The first one
(the old trgmon) only had a few, old records. There is no problem in
opening a new logfile, but when the logfile is closed by the
flush_to_logfile process for that job, the file is later reopened by name,
and the 2 jobs fight for the same file. There might be long time gaps
between the time the file is closed and the time it is reopened. One of the
jobs loses when it cannot open the file for write access and just gives up
on writing to a logfile. The other one goes on. It is probably a matter of
luck that decides who wins, with preference to the second server that starts
with the file already opened.

TRICS was in the process of servicing an initialization request from COOR.
The next message after "Copying Params to L1.5 CT Crate" would have been
"Starting L1.5 CT Crate", about 7s later. This is right when the
"file already active" message appeared. It seems like TRICS was just trying
to put out this message to the screen, and this is what caused the file
access conflict. 

We see in the LOG_SERVER logfile that its Flush_to_File process (which
wakes up every 3 min to see if the logfile needs closing, if the console
stays locked for more than 5s, or if the mailbox needs servicing) had been
waiting for the console lock for more than 5s and forced it unlocked, which
was just at the time when TRICS was also waiting/ready to write its
"Starting L1.5 CT Crate" message. It sounds like some third job must have
just been in the process of writing to the screen at that time, which
caused the "file already active" error.  

Maybe that third process was interrupted halfway while writing to the
screen, and transfer was passed to the TRICS dispatcher process which was
then busy using the full CPU for 7s during which the console stayed
unlocked.  There is no trace of another message in other logfiles, but we
couldn't see one of the MPOOL_Servers.

This is only a hypothesis, but it would probably be wise to increase the
timeout on the flush_to_logfile process for waiting on a locked console
from 5 to 15 or 30s. Another possibility is that someone bumped the hold
screen button on the TCC keyboard.
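
The "wait for the console lock, then go ahead anyway" behaviour with a longer
timeout can be sketched as follows. This is Python rather than the ELN
Pascal of the real flush process, and the function name, the 15 s default
and the use of threading.Lock are all assumptions for illustration.

```python
import threading

def flush_with_console(lock, flush, timeout_s=15.0):
    """Wait up to timeout_s for the console lock, then flush regardless.

    Returns True when the wait timed out. In that case the flush proceeds
    without owning the lock, which stands in for the forced unlock.
    """
    acquired = lock.acquire(timeout=timeout_s)
    try:
        flush()
    finally:
        if acquired:
            lock.release()   # only release a lock we actually own
    return not acquired
```

With a 15 or 30 s timeout, a 7 s stretch where the dispatcher holds the CPU
would no longer be long enough to trigger the forced path.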

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  21-OCT-1994 Philippe:  

    - Clean D0HTCC::[TRIGGER] directory.
        - copy TRICS*SEP94.LOG to MSUD01::DUA1:[BACKUP.TCC_LOG_Ib]
                  but couldn't do MSUD02::DUA0:[TCC_LOG_IB]
        - did NOT copy MPOOL*.LOG, LOG*.LOG, MAIL*.LOG from August 
        - delete TRICS logfiles from September
        - delete MPOOL*.LOG, LOG*.LOG, MAIL*.LOG from July 
            but keep ones from September until next month (deleted 21-OCT-1994)
        - Purge all files
        - [TRIGGER] now uses 53,000 blocks

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  19-OCT-1994 Philippe:  

    - build New System EWORK1:TRICS_V63.SYS_19OCT94;1, loaded
      has the "old MENU" upgraded to address any of the 4 CBUSs, 
      in particular the items #1 write, #2 read and #13 write register step

    - build New System EWORK1:TRICS_V63.SYS_19OCT94;2, loaded
      fixed bug (was missing the upgraded MOD059_MENU_IO_HANDLING.PAS)

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  13-14-OCT-1994 Philippe:  

    - build New System EWORK1:TRICS_V63.SYS_13OCT94;1 
      add messages PHAT READHIST and PHAT SHOW_REG
      the READHIST message calls a special CBUS cycle that does multiple reads
      on the same register address. The address is selected, and the data is
      read multiple times, a few microseconds apart, and histogrammed. 

    - build New System EWORK1:TRICS_V63.SYS_14OCT94;1 
      fix (forgot to zero histogram)

    - build New System EWORK1:TRICS_V63.SYS_14OCT94;2
      remove 9999 limit on histogram sample size
    
    - This system was left in.
    - Backup Directory MSU::EWORK1: was updated
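
The READHIST idea (select one register, read it many times, histogram the
values) can be sketched as below. This is a Python illustration, not the
TRICS Pascal; read_register is a hypothetical callable standing in for the
special CBUS cycle. The histogram is created fresh on every call, which is
the point of the 14OCT94;1 fix, and there is no upper limit on the sample
size.

```python
from collections import Counter
from typing import Callable

def read_hist(read_register: Callable[[], int], samples: int) -> Counter:
    """Read the same register `samples` times and histogram the values."""
    if samples < 1:
        raise ValueError("samples must be >= 1")
    hist = Counter()                 # zeroed histogram for every request
    for _ in range(samples):
        hist[read_register()] += 1   # one bin per distinct value read back
    return hist

# Example with a fake register that reads 11 most of the time and 15 once.
fake_values = iter([11, 15, 11, 11])
result = read_hist(lambda: next(fake_values), 4)
```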

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  10-12-OCT-1994 Philippe:  

    - build New System EWORK1:TRICS_V63.SYS_10OCT94;1 (never loaded)
      fix problem of interference with L1 FW
      add logic to cope with the clipped coverage at eta>16 
      and missing T1 EM and HD CAT2

    - build New System EWORK1:TRICS_V63.SYS_12OCT94;1 (never loaded)
      deselect the MBA at the end of a CBUS cycle

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   6-7-OCT-1994 Philippe:  

    - Build New System EWORK1:TRICS_V63.SYS_6OCT94;1
      This version better incorporates the large tile tests

    - cf. D0_HALL_LOGBOOK.LBK for details.

    - Build New System EWORK1:TRICS_V63.SYS_6OCT94;2
      This version reads the andor IMLROs twice in a row, and reads the other
      andor backplane for comparison.

    - Bug in diagnostics code interferes with L1 FW, 
      thus reload ETRICS:TRICS_V62.SYS_7SEP94

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  23-SEP-1994 Philippe:  

    - Move TRICS V6.2 from EWORK1: to ETRICS:

    - Build New System V6.3 in EWORK1: TRICS_V63.SYS_23SEP94
      (new additions made to random test to check the large tiles up to the
      andor network). 
      This version also has additional "progress report" messages in CHTCR and
      CTFE lookup PROM tests to show which PROM is being checked.
      The CTFE PROM test spends 56.15 s/page for 64 towers (but the remote
      console was on [at MSU!], and this may slow things down), which is about
      20% slower per tower (was 23.7 s/page for 32 towers). This test should
      be redone without the remote console to decide if this progress report
      is worth the slowdown.
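
The "about 20%" figure follows from comparing the time per tower, since the
page size doubled from 32 to 64 towers. A small Python check of that
arithmetic, using only the two timings quoted above:

```python
old_per_tower = 23.7 / 32     # s per tower, before the progress messages
new_per_tower = 56.15 / 64    # s per tower, with the progress messages
slowdown = new_per_tower / old_per_tower - 1.0   # fractional slowdown per tower
```

This gives a slowdown between 18% and 19% per tower, remote console included.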

    - cf. D0_HALL_LOGBOOK.LBK for details.

    - V6.3 diagnostics code is not yet stable/useful.
      thus reload ETRICS:TRICS_V62.SYS_7SEP94

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  22-SEP-1994 Philippe:  

    - Clean D0HTCC::[TRIGGER] directory.
        - copy TRICS*AUG94.LOG to MSUD01::DUA1:[BACKUP.TCC_LOG_Ib]
                  but couldn't do MSUD02::DUA0:[TCC_LOG_IB]
        - did NOT copy MPOOL*.LOG, LOG*.LOG, MAIL*.LOG from August 
        - delete TRICS logfiles from August
        - delete MPOOL*.LOG, LOG*.LOG, MAIL*.LOG from July 
            but keep ones from August until next month (deleted 25-AUG-1994)
        - Purge all files
        - [TRIGGER] now uses 38,000 blocks

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   7-SEP-1994 Philippe:  

    - New system is EWORK1:TRICS_V62.SYS_7SEP94;1
    1) TCC overwrites the 68k Dual Port Mem with %XFF at Load code & start crate
    2) TRICS expects to see all 32 L1.5 FW Terms

    - backup new files in DZERO::EWORK1: to MSU::EWORK1:

    - system loaded 8-SEP-1994
   
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  24-AUG-1994 Philippe:  

    - New system is EWORK1:TRICS_V62.SYS_24AUG94;1
        Allow Com File Code to properly service re-entrant requests
        Send "alert" mail msg to MSU whenever TRICS answers "BAD" to COOR INIT
      17:44 - Dan loads system, which complains about bad parameters...
      17:59 - Dan returns to system from 19-AUG 

    - New system is EWORK1:TRICS_V62.SYS_24AUG94;2
        SITE_DEPENDENT.CST had "lost" the L1.5 CT upgrade to allow 4 terms.
      22:18 - Dan loads system again, ok now.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  19-AUG-1994 Philippe:  

    - New system is EWORK1:TRICS_V62.SYS_19AUG94;1
        Four L1.5 CT terms are allowed [0..3]
        (raise the maximum L1.5 CT Term Number from 0 to 3 in SITE_DEPENDENT.CST)
      never meant to be loaded

    - New system is EWORK1:TRICS_V62.SYS_19AUG94;2
        Execute L15CT_DEFAULT_CONFIG.DAT as last step of COOR's LOADCODE msg 
        Implement keyword FROM_LOCAL_DISK for msg LOADCODE to load executables
            from local disk D0HTCC::[L15CT$EXEC]. 
      11:46 - Dan Loads system and quickly notices problem.

    - New system is EWORK1:TRICS_V62.SYS_19AUG94;3
        remove (third time) overwriting of 68k Dual Port Memory
      12:04 - Dan loads system

    - New system is EWORK1:TRICS_V62.SYS_19AUG94;4
        Philippe noticed that the bug was re-introduced because the fix wasn't 
          propagated back to MSU on 6-AUG: missing return status from 3 routines
          in L1.5 CT that cause errors at initialize in TRICS_INIT_AUXI_L15CT
      14:04 - Dan loads system, now ok

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   11-AUG-1994 Philippe:  

    - 20:27 - Dan loads EWORK1:TRICS_V62.SYS_10AUG94;1

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  10-AUG-1994 Philippe:  

    - Clean D0HTCC::[TRIGGER] directory.
        - copy TRICS*JUN94.LOG to MSUD01::DUA1:[BACKUP.TCC_LOG_Ib]
           and TRICS*JUL94.LOG 
                  but couldn't do MSUD02::DUA0:[TCC_LOG_IB]
        - did NOT copy MPOOL*.LOG, LOG*.LOG, MAIL*.LOG from June or July
        - delete TRICS logfiles from June and July
        - delete MPOOL*.LOG, LOG*.LOG, MAIL*.LOG from June, 
            but keep ones from July until next month (deleted 25-AUG-1994)
        - Purge LSM_ZEBRA.LOG

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   10-AUG-1994 Philippe:  

    - 12:36 - Dan loads EWORK1:TRICS_V62.SYS_6AUG94;1 and notices problem:
        TCC is again overwriting the 68k Dual Port Memory, but the 68k is not
        yet upgraded.

    - 14:07 - Dan returns to EWORK1:TRICS_V61.SYS_27JUL94

    - New system EWORK1:TRICS_V62.SYS_10AUG94
        remove overwriting of 68k dual port memory
      Not loaded until 11-AUG

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   6-AUG-1994 Philippe:  

    - Dan had L1.5 CT initialization messages in TRICS_INIT_AUXI_L15CT.DAT and
      noticed errors when executing "Preparing Params", and "Copying Params".
      The error only occurs from the command file, not from TRICS_ACCESS or
      COOR. The problem was a status argument in a couple of routines that
      was never explicitly written, and so kept whatever random value it had.
      The random values happened to be ok for outside messages, but not for
      command file messages.

    - New system is EWORK1:TRICS_V62.SYS_6AUG94
        Fix (missing) return status from Preparing and Copying Params to L15CT 

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   5-AUG-1994 Philippe:  

    - Archive TRICS V6.0 to MSUD01::DUA1:[ARCHIVE.TRICS_V60]
        done from MSUD01:: using 000_ARCHIVE.V60

    - Save TRICS V6.1 from DZERO::EWORK1 to DZERO::ETRICS 
        done by swapping directory file names

    - Backup V6.1 from DZERO::ETRICS to MSU::ETRICS 
        done from MSU $ @COPY_TRICS ALL DZERO::[TRG_TARGET.SOURCE_TRICS] ETRICS:

    - Build TRICS V6.2 in EWORK1: 
        starting from V6.1 files from ETRICS
        run @CHANGE_VERSION_NUMBER.com   6.1 -> 6.2
        copy files 
            MOD071_DEF_HARDWARE_TABLES.PAS  (add ERPB-MTG)
            MOD123_INIT_CBUS_CARDS.PAS      (add ERPB-MTG)
            MOD171_PARSE_GLOBAL.PAS         (use OTS$CVT_T_F)
            SITE_DEPENDENT.CST*             (new V6.2, fix scaler_recover_dir)
            TRICS_V62.PAS  to remove the kludge from 27-JUL-1994
            - note that the kludge in MOD100_HANDLE_L15CT.PAS to skip painting
              FF's in 68k dual port memory is still there (WRONG! see 10-AUG)
        $ MMS/SKIP
      New system is EWORK1:TRICS_V62.SYS_5AUG94

    - Backup V6.2 from DZERO::EWORK1 to MSU::EWORK1
        done from MSU $ @COPY_TRICS ALL DZERO::[TRG_TARGET.SOURCE_1WORK] EWORK1:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  27-JUL-1994 Philippe:  

    - Temporarily modify MOD100_HANDLE_L15CT.PAS, skip painting FF in the 68k
      dual port. The proper long term action needs to be thought of.
    - Temporarily change TRICS_V61.PAS to add a ":" between LOGGER$BRD and
      TCC_BOOT_ddmmmyy.INFO

    - There is a new system in EWORK1:TRICS_V61.SYS_27JUL94;2, 
        older V6 systems were deleted

    - This new system file was successfully loaded 

    - These temporary kludges were NOT copied to MSU

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  27-JUL-1994 Philippe:  

    - Add code to display L1.5CT 68k cycle scalers (big N, little n,...). 
      There is a message to read the scalers any time: L15CTSYS 68K_CNT CRATE(0)
      This is also executed automatically at Init and Load code, along with 68k
      errors and 68k flags.

    - Update L1.5CT 68K errors display with the latest addition: count of byte
      misalignment errors in Object Lists.  

    - There is a new system in EWORK1:TRICS_V61.SYS_27JUL94;1
    - Backup V6.0 DZERO::ETRICS: to MSU::ETRICS: (using MSU::EWORK1: from 7-JUL)
    - Backup V6.1 DZERO::EWORK1: to MSU::EWORK1:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  26-JUL-1994 Philippe:  

    - Fixed 2 reasons why the "watch double buffer" process was not kicking in
      (that gave the 50% chance to have read/write_A_B start off wrong).
    - Also fixed an alignment problem in the shared memory space (that was
      causing all weird boot status, wake up words,...).

    - There is a new system in EWORK1:TRICS_V61.SYS_26JUL94 (never loaded)

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  22-JUL-1994 Philippe:  

    - Dan Loads new code TRICS_V61.SYS_21JUL94
        - Problem with "watch double buffer" task obviously not doing its job.
        - Problem with DSP status 

    - Return to old V5.3 code 

    - L1C coverage was "clipped down" to eta = 3.2 (TT_Eta=16) for
        Global Missing Et
        Global EM Et  \
        Global HD Et  /  and thus Global Total Et

      Tower Counts and Large Tiles are still using full coverage

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  21-JUL-1994 Philippe:  

   1) Move to Version V6.1

   2) Add repeating screen message at boot time in case of pQBA or pVBA hardware
      problem detected at boot time (e.g. power off).

   3) At Initialize, the ZRL cards are reset, and reloaded. The DRV11Js too.

   4) At boot time, clarify the various screen messages related to
      ZRL/pVBA/pQBA/DRV 

   5) At initialize the 68k is now "parked" before the DSPs are reset 

   6) At initialize, and at loadcode,  the 68k error counters and flags are read
      and put in the logfile. (Your latest COM_PORT error counter isn't in yet).

   7) Execute file TRICS_INIT_AUXI_L15CT.DAT right after TRICS_INIT_AUXI.DAT  
      This is done at boot time, and at initialize
      (note: missing file is not fatal)
      Dan, please start a new file with the ERPB MTG stuff at your convenience.

   8) add messages 
          L15CTSYS DSP_STAT  check all DSP status                              
          L15CTSYS 68K_CTRL  check 68k control words (wake up + transfer words)
          L15CTSYS 68K_STAT  check 68k status                                  
          L15CTSYS  68K_ERR  check 68k run-time error counters                 
          L15CTSYS 68K_FLAG  Check 68k software flags                          
          L15CTSYS VER_TSEL  Verify Term Select Paddle Board Memory            

      These messages aren't in TRICS_ACCESS yet. Use:
        $ @ EENV:COMMANDS
        $ SEND_TRICS L15CTSYS DSP_STAT CRATE(0)
      Note that they ALL need the CRATE argument.

   9) add message PHAT READPVBA to read pVBA control registers.

  10) L1.5 CT Crate_ID and Term_Num are range checked against the current
      implementation (must be 0 and 0)

  11) remove integer decoding in error message  at "start L1.5CT" of
      local/global/frame 'param out of range' messages, display Hex only. (TCC
      doesn't know the data type)

  12) Fix bug, there were two CLOSE(initfile) statements in INIT_AUXI service.
      There is a small chance that this could have been causing our
      TCC/Disk hangup problems.

  13) Read all scalers, as a recovery for end run/store in case of TCC
      crash/reboot.
      Done After initializing the ZRL, DRV11j...
           After Boot_Auxi because this is where the scaler list is defined
           But before any register is initialized.
      File is LOGGER$BRD:TCC_BOOT_ddmmmyy.INFO

    - There is a new system in EWORK1:TRICS_V61.SYS_21JUL94 

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  20-JUL-1994 Philippe:  

    - Message from Oscar Ramirez about a problem during the PR_TRIG run summary
      command file that calls TRGMON

Hi,

I don't know if this is important but in any way I want to send you the
following trace back of the PR_TRIG COMMAND for the TRIGMON DISPLAY.
when I pick up the print out there was only a couple of sheets that came from
the file SUPDUMP.TXT and the triggers bits info is missing.

                                          
                                         DJOKO AND RAMIREZ
                                           THE SHIFTERS

P.S. After try again everything seems to be O.K. ie, I got the trgmon_dump.txt
file printed


>>> Doing TRIGGERS

%SYSTEM-F-FILNOTACC, file not accessed on channel
%TRACE-F-NOMSG, Message number 0009804C
module name     routine name                     line       rel PC    abs PC

MESSAGE_XFER    RECORD_CALL                       527      00000190  000035BC
                                                           80892B55  80892B55
CONNECT         ITC_DISCONNECT                   1033      000000C1  00000FAD
                                                           00000850  00000850

    - reply is:

  I don't know what happened, this looks to me like an ITC link problem. 
  There is not much I can do right now, especially if this was an isolated 
problem. I will enter this information in our records for further reference.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  12-JUL-1994 Philippe:  

    - TCC became messed up again, in a manner similar to
      the time (25-MAY-1994) when there were network problems and many level 2
      nodes got hung. I don't know what caused today's problem. 
      TCC could no longer access its disk, with the same weird error messages.

      I tried changing the boot directory to using the host, on the fly, from
      the begin_end_run task
      Edebug 8,12>dep mod_common_global_flags\boot_directory_name
        = '57.3"TRGUSER TRGGER"::TRGCUR:'

      But the begin_end_run task was stuck in the CLOSE statement in
      close_auxi_file for the file 'TRICS_FORCE_BUF_UPDATE.DAT', and I couldn't
      get it un-stuck anyway.

      (also notice a bug that the close statement appears twice in the routine,
      and that bug has been there since beginning of the service, but the task
      was stuck in the first call, not the second).

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   7-JUL-1994 Philippe:  

    - DZERO::EWORK2: moved to DZERO::ETRICS (see new policy on top of file)
    - Backed up V6.0 from DZERO::EWORK1: to MSU::EWORK1:
    - Note that V5.3 DZERO::ETRICS: was not backed up to MSU:: because it is
      obsolete, and EWORK1: is already stable.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  28-JUN-1994 Philippe: 

    - Problems from last week (below) were due to a bug in the ZRL Interrupt
      Service Routine MACRO Dual_ISR, where the register R5 was not saved.

    New system is ework1:TRICS_V60.SYS_28JUN94;1

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  27-JUN-1994 Philippe:

    - Send message to Norm Amos on how to recover most of the luminosity data
      after a TCC reboot during a store.

From:   MSUPA::LAURENS      "Philippe"   27-JUN-1994 10:56:35.25
To: FNAL::AMOS
CC: FNBIT::D0::JGUIDA,laurens,edmunds
Subj:   RE: recovering lost luminosity info


Norm,

   We had TCC problems during a store last week, here is what I think is the 
method that would recover the most information. I don't expect you will find 
anything surprising here, and you might have already figured it out for 
yourself in earlier instances of TCC crash.

   For the missing end of run file (the run during which TCC rebooted), I 
would simply take the last pause/resume run file from before the crash, and 
maybe extrapolate for the actual end of the run.

   The other half of the problem is that all scalers have been reset between 
the begin and end of store. I would simply take the end of store file (which 
is from after the reboot) and correct each scaler by adding the values from 
the last pause/resume file (this is from right before the reboot).

   Philippe
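
The scaler correction described in this message amounts to a per-scaler
addition. A minimal Python sketch, assuming each file has already been
parsed into a name-to-count dictionary (the parsing, the function name and
the scaler names used in the example are hypothetical):

```python
def recover_store_scalers(end_of_store: dict, last_pause_resume: dict) -> dict:
    """Add the pre-reboot counts back onto the post-reboot counts.

    end_of_store      : counts from the end of store file (after the reboot)
    last_pause_resume : counts from the last pause/resume file (before it)
    A scaler missing from the pause/resume file is left uncorrected.
    """
    return {name: count + last_pause_resume.get(name, 0)
            for name, count in end_of_store.items()}

# Example with made-up scaler names and values.
corrected = recover_store_scalers(
    {"beam_x": 1200, "l1_fired": 300},
    {"beam_x": 5000, "l1_fired": 900},
)
```

Counts accumulated between the last pause/resume and the reboot are still
lost, so the corrected values are a lower bound.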

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  24-JUN-1994 Philippe: another crash, but now in user process.

 Job 10, process 1, program MPOOL_SERVER raised exception.
%SYSTEM-F-ACCVIO, access violation, reason mask=04, virtual address=00000004,
PC=00003C9E, PSL=03C000B4
  Module  MOD_MPOOL_SERVER
  1612:       FOR vetonum := 0 TO 6
>>1613:       DO base_address^.sptrgcnt[sptrgnum].stvetos[vetonum] :=
  1614:           mpool_rec^.current.sptrgcnt[sptrgnum].stvetos[vetonum]
  1615:         - mpool_rec^.previous.sptrgcnt[sptrgnum].stvetos[vetonum] ;
  1616:
  1617:       base_address^.sptrgcnt[sptrgnum].L15_incr.st_confirm :=

--Edebug 10,1>sho call
 Module name     Routine or Psect name           Line     Rel PC   Abs PC
 MOD_MPOOL_SERVERREAD_MPOOL_ST_GS                1613     0000016C 00003C9E
 MOD_MPOOL_SERVERSERVICE_REQUEST                 1357     000000DD 00002E65
 MOD_MPOOL_SERVERMPOOL_SERVER                    1286     00000459 00002C84
                                                          00000000 800049A7
Edebug 10,1>ex vetonum
 VETONUM:  2 (00000002)
Edebug 10,1>ex/inst %Line 1612
 %Line 1612 + 0000: MOVL   #00,-08(FP)
Edebug 10,1>e/i
 %Line 1613 + 0000: MULL3  #00000060,-0C(FP),R3
Edebug 10,1>e/i
 %Line 1613 + 0009: MULL3  #04,-08(FP),R2
Edebug 10,1>e/i
 %Line 1613 + 000E: ADDL2  R2,R3
Edebug 10,1>e/i
 %Line 1613 + 0011: MOVL   -10(FP),R2
Edebug 10,1>e/i
 %Line 1613 + 0015: MOVAB  0B0C(R2)[R3],R5
Edebug 10,1>e/i
 %Line 1613 + 001B: MULL3  #00000050,-0C(FP),R3

Edebug 10,1>ex r5
 R5:  4 (00000004)
Edebug 10,1>ex r3
 R3:  2328 (00000918)
Edebug 10,1>ex r2
 R2:  0 (00000000)
Edebug 10,1>ex fp
 FP:  2147480072 (7FFFF208)

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 22-23-JUN-1994 Philippe: 

    - Install new Double Port "Stereo" ZRL VAXStation 4000, with new "baby" VNE
      backplane in BA23 enclosure.
      New TCC first powered at 14:00, Booted with new Code for Dual Interface,
      and with the new store at 14:42 22-jun-94

      TCC crash to <Kernel Edebug> at about 22:28 with an Access violation 

800066DC    MTPR 50(R5),#10     (Move To Processor Register)
            LDPCTX              (Load Process Context)
            REI                 (Return from Interrupt)

    This Absolute PC address is part of the shareable image KER$SCHEDULE_JOB 
    (as listed in ELN$:4NNKER.MAP)

With        R5 = 00000004   this is not a legal address.

Call Stack:     Absolute PC

                800066DC
                7FFFFD4C
                7FFFFDBC

   R0                General register 0
   .                   .
   R11               General register 11
   R12 or AP         General register 12 or argument pointer. 
   FP                Frame pointer
   SP                Stack pointer
   PC                Program counter


    - TCC is rebooted, crashes again at about 16:10 23-JUN-94 
      again with an Access violation 

 800060EF       MOVZBL  0A(R1),14(R5)     (Move Zero-Extended Byte to LongWord)

    This is part of the shareable image KER$UNWAIT

With        R1 = 80441680
            R5 = 00000004   this is not a legal address, and by coincidence (?)
                            it is the same processor register, and the same 
                            bad content.
 
Call Stack now
        800060EF
        7FFFFB84
        7FFFFBDC
        7FFFFC34
        7FFFFC98
        7FFFFD28
        7FFFFD7C
        7FFFFDBC

    - Boot the old code in the new dual ZRL interface box, to try to get an 
      idea if this is a hardware problem with the dual box, or a software 
      problem with the new code.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 13-JUN-1994 Philippe: 

    - Investigate cause of 
        "Error during Reset COMINT file"
                    11-JUN-1994 18:18:34.48 
                                18:19:38.61   
                                18:20:43.17   
                                18:21:47.47   
                                18:22:51.92   
                                18:23:56.41   
                                18:25:00.77   
                                18:26:05.23   
        Then Initialize 
                    12-JUN-1994 08:19:38.60
                                09:13:01.68
                                09:31:15.32
        and Boot at 
                    12-JUN-1994 09:36:27.23 

      These messages were generated when TRICS was not able to access the disk
     (RESET COMINT is in TRICS_RESET_DIRECTIVES.DAT)
      The last disk accesses were the regular flush buffer to file, at 
        LOG_SERVER_26MAY94.LOG;2             4/10      31-MAY-1994 09:05:19.00  
        MAIL_SERVER_26MAY94.LOG;2           57/60      11-JUN-1994 12:05:30.00  
        MPOOL_SERVER_26MAY94.LOG;2        4506/4510    11-JUN-1994 14:29:33.00  
        TRICS_26MAY94.LOG;2              25721/25730   11-JUN-1994 13:43:34.00  
      
    I imagine we had either a disk failure that caused the ELN disk driver job
to quit (remember that ELN has no recovery ability), or a recurrence of a
system/network problem like on the evening of 25-MAY-1994.

    - Clean D0HTCC::[TRIGGER] directory.
        - copy TRICS*MAY94.LOG to MSUD01::DUA1:[BACKUP.TCC_LOG_Ib]
                           and to MSUD02::DUA0:[TCC_LOG_IB]
        - did NOT copy MPOOL*.LOG, LOG*.LOG, MAIL*.LOG 
        - delete ALL [TRIGGER] logfiles from May

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  9-JUN-1994 Philippe: 

    - MSUD02::DUA0:[TCC_LOG_IA] and MSUD02::DUA1:[TCC_LOG_IA] have the logfiles
      from run Ia. They no longer are on MSUD01::
    
    - MSUD02::DUA0:[TCC_LOG_IB] has a copy of MSUD01::DUA1:[BACKUP.TCC_LOG_Ib]
      It must be updated at the same time.

    - Clean D0HTCC::[TRIGGER] directory.
        - copy TRICS*APR94.LOG to MSUD01::DUA1:[BACKUP.TCC_LOG_Ib]
                           and to MSUD02::DUA0:[TCC_LOG_IB]
        - did NOT copy MPOOL*.LOG, LOG*.LOG, MAIL*.LOG
        - delete ALL [TRIGGER] logfiles from April
        - [TRIGGER] now uses 68,000 blocks

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 31-MAY-1994 Philippe: 

    - check logfile for source of MAIL error message 
Date:   30-MAY-1994 12:24:47.45
Subj:   TRICS V5.3/26-MAY-1994/ 1 Errors Initializing Framework Registers       

    TRICS_26MAY94.LOG had 
%% time: 30-MAY-1994 11:25:55.77
S-INI/HDB%COORini% Initializing All Framework Registers                         
E-HIO/HDB% Failure Writing  15 @ cbus 2 mba 129 ca 16 fa   3 read  11           
    then following in the same initialize,
S-INI/ODB%COORini% Initializing all Specific Triggers                           
E-HIO/HDB%COORini% Previously  15 instead of  11 @ cbus 2 mba 129 ca 16 fa   3  
    This is an ANDOR card that has shown this behavior for a long time. This is
probably nothing to panic about, as the correct value was probably programmed,
and only the immediate read back failed.

    But also next initialize had no error at init-all-fw-reg, but 
%% time: 30-MAY-1994 14:40:48.12
S-INI/ODB%COORini% Initializing all Specific Triggers                           
E-HIO/HDB%COORini% Previously  11 instead of  15 @ cbus 2 mba 129 ca  4 fa   9  
    Hopefully this was only a read back problem, and the proper value was
stored and not lost.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  26-MAY-1994 Philippe:

    
    Jan called this morning, here are the reconstructed facts:

    1) 15mn after Dan left last night, we had a water drip trip. 
    2) They were probably not using it because they only called this morning.
    3) Jan was there, resetting the RMI and the RPSS worked. 
    4) From TRICS logfiles, it was clear there was a global communication
       problem, and symptoms with assistant CBUS and every write reads zero.
    5) I had her check there was power and LED's in M114. There were only a
       few LEDs in M102,M103, and no Beam X LED's
    6) asked Jan to turn off the BA23 and the 4000, count to 60 and turn them
       back on.
    7) Everything is back to normal.

    This is the "standard" way the 4000/ZRL can get hosed: having M114 powered
off for extended periods of time.

    - In the process, I noticed error messages at the beginning of the
logfile from this morning and from last night (i.e. the new system) that look
like TRICS failed all the BOOT_AUXI messages defining the end of run scalers.
So I took the opportunity to restore the earlier system. It is not clear to me
what went wrong, as I changed nothing that could do this (that's what I
thought, but obviously, I am wrong).

    8-JUN-1994 update: Yes, I was wrong.
    the bug was in MOD171_PARSE_GLOBAL.PAS, see TRICS.LBK

    24-JUN-1994 update: 
    in EWORK2:
    delete the files
            DESCRIP.MMS
            MOD171_PARSE_GLOBAL.PAS 
            MOD247_L15CT_DISPATCH.PAS
            MOD263_SOFT_CONN_DISPATCH.PAS
            TRICS_V53.EXE
        and TRICS_V53.SYS_12MAY94 
    restore correct versions of 
            MOD171_PARSE_GLOBAL.PAS 
        and MOD263_SOFT_CONN_DISPATCH.PAS
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  25-MAY-1994 Philippe:

    - Coor had a time out waiting for an  Acknowledge  from TCC at end of run
time. Read log gets the following:
Looking for Current Log File
%DIRECT-E-OPENIN, error opening D0HTCC::DUA0:[TRIGGER]TRICS_*.LOG; as input
-RMS-F-NET, network operation failed at remote node; DAP code = 01F77C54

$ dirs  d0htcc::dua0:[trigger]
%DIRECT-E-OPENIN, error opening D0HTCC::DUA0:[TRIGGER]*.*;* as input
-RMS-F-NET, network operation failed at remote node; DAP code = 01F77C54

TRICS can talk to TCC OK   e.g.  Dan can do a read reg ok.

    Philippe didn't learn anything, and has no idea of what is going on.
Philippe doesn't believe it is a problem with our disk. None of TRICS'
process/jobs seems to be in trouble, including the disk driver.

    From Set host TCC, ECL> DIR complained about "network timeout". 
What could it possibly need the network for (name service ?!?)

    Jan says that 12  L2 nodes have died with some strange network problem. 
This remains a mystery.

    - Later on, Dan slides in new system that accepts L1.5 CalTrig messages 
      (but takes no action)   
      EWORK2:TRICS_V53.SYS_12MAY94

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  12-MAY-1994 Philippe: 

    - build new system that accepts L1.5 CalTrig messages (but takes no action) 
        new files are 
            MOD171_PARSE_GLOBAL.PAS (to accept more than 8 consecutive blanks)
            MOD263_SOFT_CONN_DISPATCH.PAS \ to admit L15CT messages
            MOD247_L15CT_DISPATCH.PAS     /
        and DESCRIP.MMS
      new system is EWORK2:TRICS_V53.SYS_12MAY94

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  22-APR-1994 Philippe: 

    - Clean D0HTCC::[TRIGGER] directory.
        - copy TRICS*FEB94.LOG to MSUD01::DUA1:[BACKUP.TCC_LOG_Ib]
        - copy TRICS*MAR94.LOG to MSUD01::DUA1:[BACKUP.TCC_LOG_Ib]
        - files TRICS*JAN94.LOG and TRICS*DEC93.LOG had already been copied 
        - did NOT copy MPOOL*.LOG, LOG*.LOG, MAIL*.LOG
        - delete ALL logfiles from before April
        - [TRIGGER] now uses 48,200 blocks

    - Area MSUD01::DUA1:[BACKUP.LOGFILES_D0HALL] has been deleted.
      all files from run Ia have been copied to disks D0MSU2$DUA0: and 
      D0MSU2$DUA1: in area [TCC_LOG_Ia] for archival.

    - From now on, all new run Ib files copied to
      MSUD01::DUA1:[BACKUP.TCC_LOG_Ib] are also copied to D0MSU2$DUA0: and
      D0MSU2$DUA1: in area [TCC_LOG_Ib] for backup. 

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   28-MAR-1994 Philippe: 

    Control Room calls, they can no longer create new TRGMON windows
    It seems that it is only the TRGMON server that is in trouble, with some
    kind of ITC problem. 
    In order to learn more, I use edebug to see what D0HTCC is doing.  
    TRGMON service is back to 100 % normal. I restarted the separate job
    that serves the monitoring information to TRGMON.
    This will not have any effect on the data taken in this run, nor will it
    affect the end of run luminosity information.
    TCC had simply reached the maximum number of 15 allowed ITC connections. 
    However I don't think this was real, that is I don't think there were 15
    TRGMON running at the same time. I am suspicious that ITC is "sometimes"
    forgetting to notice that a channel has been fully released and is available
    for re-use -- Maybe this happens when a host node crashes, as we  had a
    few times last week -- I will investigate some more tomorrow morning.

    While investigating this problem, I noticed weird logfile content, and the
    bug that is causing it:
    the logfile shows a series (every 5s) of Channel #11 ... (ReConnecting) 
    the variable message_cnt is declared 1..10 while the maximum number of ITC
    channels is now set to 15 in [.itc.inc]ITC_CONFIG.INC.
    I don't think this is related to the current problem, but this was clearly
    overwriting something else...
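
    The range mismatch described above can be sketched in Python. The two
    bounds (10 and 15) are the ones quoted in this entry; the class and
    function names are hypothetical and only illustrate a counter table
    that refuses an out-of-range channel instead of writing past its end.

```python
MAX_ITC_CHANNELS = 15   # value quoted from ITC_CONFIG.INC in this entry
DECLARED_UPPER = 10     # upper bound of the old message_cnt declaration


class ChannelCounters:
    """Per-channel counters indexed 1..upper with an explicit range check."""

    def __init__(self, upper):
        self.upper = upper
        self._counts = [0] * upper

    def bump(self, channel):
        """Increment and return the counter for a channel, or raise."""
        if not 1 <= channel <= self.upper:
            raise IndexError("channel %d outside 1..%d" % (channel, self.upper))
        self._counts[channel - 1] += 1
        return self._counts[channel - 1]


def unsafe_channels(declared_upper, max_channels):
    """Channel numbers that exist but fall outside the declared range."""
    return list(range(declared_upper + 1, max_channels + 1))


at_risk = unsafe_channels(DECLARED_UPPER, MAX_ITC_CHANNELS)
```

    With these bounds the channels at risk are 11 to 15, which is consistent
    with Channel #11 being the one that shows up in the logfile.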

    Update 1-APR-1994 : investigating in the old TCC Mpool_server logfiles, 
    the time when TCC/ITC started losing ITC connections coincides with the
    Edebug session where I located the source of the MP_FOREIGN integer
    overflow. This most likely is simply what triggered it.

    Also note that the new logfile started 28-mar shows no sign of any channel
    being lost. There doesn't seem to be an intrinsic problem here.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  25-MAR-1994 Philippe: 

    - Notice that the remote console displays some Mpool_server integer
      overflow message during the MP_foreign request.
      use Edebug to capture a few sets of mpool_rec^.current.foreign.scaler and
      mpool_rec^.previous.foreign.scaler to understand where this is coming
      from. It is simply that the first 8 scalers are not plugged in and read
      unpredictably.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  18-FEB-1994 Philippe: 

    - load new code that solves the problem "E-MPL/STL% Pilot captured Spy, but
      not Assistant" 
      File EWORK2:TRICS_V53.SYS_18FEB94

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  17-FEB-1994 Philippe: 

    - Load new code where the prom test reads global counts as signed integers 
      and tracks tier#1 truncation.
      File EWORK2:TRICS_V53.SYS_17FEB94
      Run PROM tests on all PROMS, all pages
      Run over 5 Mega Loops of random test
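
    Reading a count as a signed integer amounts to a two's-complement
    reinterpretation of the raw register value. A minimal Python sketch
    follows; the register width is not stated in this log book, so it is
    left as a parameter, and the function name is hypothetical.

```python
def to_signed(raw, bits):
    """Interpret an unsigned register value as a two's-complement integer."""
    if bits < 1 or not 0 <= raw < (1 << bits):
        raise ValueError("raw value does not fit in the given width")
    sign_bit = 1 << (bits - 1)
    # Values with the top bit set represent negative numbers.
    return raw - (1 << bits) if raw & sign_bit else raw
```

    For an 8-bit example, to_signed(0xFF, 8) gives -1 and to_signed(0x7F, 8)
    gives 127.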

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  10-FEB-1994 Philippe: 

    - The directory ETRICS at Dzero is now officially used to create a copy of
      the last stable code. I will be using it to back up the EWORKn directory
      before I start modifying the code for the next version of TRICS at Fermi.
        -> Etrics will keep the n-1 version of the TCC code and system.

    - The directory ETRICS at MSU will now officially be used as a backup of the
      latest code and system from DZero.
        - I will also keep a subdirectory [.tcc] for the files needed on TCC
        - and a subdirectory [.trguser] for all the important files of the
        trguser account at DZero. (This is not a redundancy with respect to
        [trg_current.dzero], as the appropriate files will be archived with the 
        corresponding version of TRICS)

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  3-FEB-1994 Philippe: 

    - Install dual COMINT system. 
      This is TRICS Version V5.3, and the code at DZero is in directory EWORK2:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  2-FEB-1994 Philippe: 

  - check TRICS_23JAN94.LOG logfile for source of mail message. 
    One error while initializing framework.
        Failure Writing  15 @ cbus 0 mba 129 ca 11 fa  12 read  11  
    but later 
        Previously  15 instead of  11 @ cbus 0 mba 129 ca 11 fa  12 

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 23-JAN-1994 Philippe: 

    - Control Room (Mike T.) calls: TCC unreachable from COOR or TRGMON.
      Use set host/log to run edebug: 
        many processes were stuck while writing to the console 
        and 2 of the processes had noticed the system ran out of Pool.
            %PAS-F-ERRDURPUT, error during PUT
                -KERNEL-F-NO_POOL, no pool available

    - The system parameter "Pool Blocks" was down at 2048, I don't know why.
      I made a new system with 10 k blocks. There is obviously still something
      consuming this resource.
      $ COPY DZERO::EWORK1:TRICS_V52.DAT MSU::EWORK3:*.DAT_DZERO

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 14-JAN-1994 Philippe: 

    - modify fill_monit_pool to correct for the offset in Tier#3 energies copied
      into the data block.
      file name TRICS_V52.SYS_14JAN94. File loaded.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 13-JAN-1994 Philippe: 

    - we had another occurrence of ZRL/pQBA being screwed up without notifying
      TCC. This happened after a power glitch that tripped Level 1.

    - make and load same system without Name Server enable
      file TRICS_V52.SYS_13JAN94

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 12-JAN-1994 Philippe: 

    - Inspect logfile TRICS_08JAN94.LOG (up to 12-JAN-1994 11:11:42.07)
      Notice many occurrences of incorrect prescaler messages from ELNCON Ch#2,
      need to check with Dan to see who sent them
      one occurrence of bad register readout during sptrg initialization 
        Previously  11 instead of  15 @ cbus 0 mba 129 ca  1 fa  28
      
    - Inspect logfiles
        TRICS_02JAN94.LOG;1 (includes the pQBA problem noted earlier)
                            (No pQBA device error was logged at the initial
                             problem time. But the end of the logfile shows what
                             probably was a manual power-cycling of the BA23.
                             The logfile shows that the pQBA device was woken up
                             and that TRICS reset it. There is also a successful
                             write afterwards)  
        TRICS_03JAN94.LOG;1
        TRICS_06JAN94.LOG;2
        TRICS_06JAN94.LOG;1
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 ..-JAN-1994 Philippe, while I was sick

    - There has been one occurrence where TRICS would systematically read zeroes
      back from the CalTrig (through the pQBA). Power cycling both boxes cured
      it. Re-triggering TCC was probably enough. There had been (it is believed)
      a power glitch in the HV racks, and TCC/BA23 were still taking power from
      these racks at the time.

    - There is a problem in Level 2 that, when started, kills the Level 2 nodes
      (and probably TCC) as they pass a corrupted table of the name server
      around each other. All nodes need to be rebooted then.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 28-DEC-1993 Philippe: 

    - inspect logfiles for closer analysis of earlier tests
        TRICS_21DEC93.LOG;2
        TRICS_21DEC93.LOG;1
        TRICS_22DEC93.LOG;1
      cf. D0_HALL_LOGBOOK.LBK 

    - also look in TRICS_22DEC93.LOG;2 but the file is not closed
        one init error instance of 
    %% time: 25-DEC-1993 01:31:12.53
    Previously  11 instead of  15 @ cbus 0 mba 129 ca  7 fa  28
    (cont) 00001011 i/of 00001111 Msk= W 11111111 R 11111111, Writing 240

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 20-22-DEC-1993 Philippe: 

    - fix all known remaining problems in random tests 
        truncation needs to watch Px/Py sign

    - fix some of the ctfe prom test problems (signed Px/Py sums,
      initializing all tested cells between pages) but still some problems are
      left (reading signed results)


      cf. D0_HALL_LOGBOOK.LBK 
        - Temporary init error on one L15 digimem reg after radiator work
        - Is the eta 17..20 tier#2 wiring different?
        - Eta +/- 1..16 CHTCR PROMs were tested.
        - Eta +/- 1..20 Phi 1..5 CTFE PROMs tested
        - several Random Test Runs
        - temporary 66 Gev EMEt traced to tier#3
        - Run Find DAC on 17..20.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 16-DEC-1993 Philippe: 

    - We made a second run of Find DAC last night that wasn't interrupted by an
      INITIALIZE. The new values for 1..16 were loaded in INIT_DAC_BYTES.LSM
      The values for 17..20 were left unchanged

    - Update TRICS_INIT_AUXI.DAT 
      Change the location of the L0-L1 box from Tier#2 9..16 to Tier#2 17..20 
      (i.e. from mba 209 to 249) 

    - propagate above change to USER1:[TRGUSER.DIRECT_TO_TCC]FORCE_L0_FAST_Z.MSG
      and copy to msu::hepe:[TRGUSER.DIRECT_TO_TCC]

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 15-DEC-1993 Philippe: 

    - copy MSU::LSMP$DATA:NEW_D0HTCC_FILE_FOR_L15CT.LSO
      to D0HTCC::[TRIGGER]LOOKUP_SYSTEM_MANAGER.ZEB, and TRGCUR:
      move TEMP_D0HTCC_FILE_L15CT.LSO to [TRG_CURRENT.OBSOLETE]
      move and rename old TRGCUR:LOOKUP_SYSTEM_MANAGER.ZEB to
               [.OBSOLETE]LOOKUP_SYSTEM_MANAGER.ZEB_PRE_L15CT

    - modify TRICS_BOOT_AUXI.DAT
      Change method for shrinking eta coverage. 
      It is required by TRICS that the towers in INIT_DAC_BYTES.LSM exactly
      match the coverage currently defined. Failure to interpret the file will
      cause TRICS to keep the default value 10 as DAC_BYTE. Also any tower not
      defined in INIT_DAC_BYTES.LSM keeps 10 as DAC_BYTE. 
      This becomes inconvenient when we later want to restore a larger
      coverage.
      We now have a file with the pedestal for eta 1..20, but would like to
      limit the current coverage to 1..16.
      TRICS_BOOT_AUXI.DAT now first defines full coverage (or whatever
      appropriate coverage for the INIT_DAC_BYTES.LSM), then loads
      INIT_DAC_BYTES.LSM, then defines the actual more limited coverage (if
      necessary).
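
    The ordering above can be illustrated with a small Python model. This
    is a simplified sketch, not the TRICS code: towers are reduced to a
    positive eta index, the class and method names are hypothetical, and
    the DAC value 12 is a made-up placeholder. Only the default of 10 and
    the exact-match rule come from this entry.

```python
DEFAULT_DAC_BYTE = 10   # default quoted in this entry


class TowerSetup:
    """Toy model of coverage definition and DAC byte loading."""

    def __init__(self):
        self.coverage = set()
        self.dac = {}

    def define_coverage(self, eta_lo, eta_hi):
        """Set the active eta range; new towers start at the default."""
        self.coverage = set(range(eta_lo, eta_hi + 1))
        for eta in self.coverage:
            self.dac.setdefault(eta, DEFAULT_DAC_BYTE)

    def load_dac_bytes(self, values):
        """Load per-eta DAC bytes; refuse unless they match the coverage."""
        if set(values) != self.coverage:
            return False        # file not interpreted, defaults are kept
        self.dac.update(values)
        return True


file_values = {eta: 12 for eta in range(1, 21)}   # file covers eta 1..20

# Old order: restrict first, then load. The load is refused.
old = TowerSetup()
old.define_coverage(1, 16)
old_loaded = old.load_dac_bytes(file_values)

# New order: full coverage, load, then restrict to 1..16.
new = TowerSetup()
new.define_coverage(1, 20)
new_loaded = new.load_dac_bytes(file_values)
new.define_coverage(1, 16)
```

    In the old order every tower stays at 10. In the new order the towers
    1..16 keep the loaded values, and the file does not need editing when
    the coverage is widened again.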

    - restore the 2-DEC-93 version of INIT_DAC_BYTES.LSM

    - do trics ->Initialize.Trg.Twr and get 

C-RCV/CH2%   1:42  %00000009  INITIAL   TRGTWR  MAGN_ETA(1:16)                 
E-HIO/HDB%C09/1: Previously  34 instead of  10 @ cbus 0 mba 201 ca 55 fa   4   
E-HIO/HDB% (cont) 00100010 i/of 00001010 Msk= W 11111111 R 11111111, Writing  3
E-HIO/HDB%C09/1: Previously  41 instead of  45 @ cbus 1 mba 169 ca 21 fa  22   
E-HIO/HDB% (cont) 00101001 i/of 00101101 Msk= W 00111111 R 00111111, Writing  3
C-ACK/CH2%   1:44  %00000E6D   ACKNOW 00000009       OK     DONE               

    0 201 55 4  is CTFE (+9..12,12) ped dac control chan #3 i.e eta +11
    1 169 21 22 is CAT2 (+1..4,25..32) -Py comparator register #2 bit 1..6
    do it again, and get no complaint 
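
    The (cont) lines carry the two values in binary together with the write
    and read masks, which is enough to see which bits disagreed. A minimal
    Python sketch, assuming the layout shown above. Applying the read mask
    and numbering bits from 1 at the least significant end are my
    assumptions, the latter inferred from the "bit 1..6" remark for a mask
    of 00111111.

```python
import re

CONT_RE = re.compile(
    r"\(cont\)\s+(?P<a>[01]{8})\s+i/of\s+(?P<b>[01]{8})"
    r"\s+Msk=\s*W\s+(?P<w>[01]{8})\s+R\s+(?P<r>[01]{8})")


def differing_bits(cont_line):
    """Return the 1-based bit numbers that differ under the read mask."""
    m = CONT_RE.search(cont_line)
    if m is None:
        return None
    first = int(m.group("a"), 2)
    second = int(m.group("b"), 2)
    read_mask = int(m.group("r"), 2)
    diff = (first ^ second) & read_mask
    return [bit + 1 for bit in range(8) if diff & (1 << bit)]


ctfe = differing_bits(
    "E-HIO/HDB% (cont) 00100010 i/of 00001010 Msk= W 11111111 R 11111111")
cat2 = differing_bits(
    "E-HIO/HDB% (cont) 00101001 i/of 00101101 Msk= W 00111111 R 00111111")
```

    For the two messages above this gives bits 4 and 6 for the CTFE
    register and bit 3 for the CAT2 register.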

    - Fix TRICS tree offset computation and build new system.
      HD energy sums showed about -15 GeV. Also noticed that the corrections
      loaded in Tier 3 were the same for EM Et, EM L2, HD Et, HD L2.


    - Fix TRICS Tree Browsing software to read CAT3 operands. 
      The problem was with a register address shift while reading CAT3 operands
      Use Tree browsing to locate a problem 
            with tier #1 cat2 for eta -5..-8, phi 1..8      
                was 255 too low
                and read phi 7 input as 0
                card replaced, problem gone
             and tier #1 cat2 for eta -5..-8, phi 17..24
                was %X180 too high
                card replaced, problem gone

    - Try to detect a problem that seemed to make Global EMEt read 16 GeV more
      than EM L2, even when all towers were excluded and both lookups locked on
      page 4. We excluded all EM towers, and set a specific trigger to require
      250 MeV of EM Et. The Andor rate stayed at zero.

    - Extend this testing method to other quantities. Exclude all EM and HD
      trigger towers and verify that the andor rate stays at zero when one
      requires 
        250 MeV of Tot Et
        or any tower above an EM refset of 250 MeV
        or any tower above a Tot refset of 500 MeV


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 10-DEC-1993 Philippe: 

    - Notify John F and Jan G that new system is 4000 and now uses disk booting
    - John updates permanent databases

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  9-DEC-1993 Philippe: 

    - Move EWORK1 TRICS_V5.0 to ETRICS
    - Fill EWORK1 with TRICS V5.2 for 4000 M60 code
    - now use disk booting
        - Change INITIALIZE_TIME.PAS
          Write D0HSC address in KER$GQ_HOST_ADDRESS[2] at DZero for Disk Boot
        - Change MOD033_HANDLE_CONSOLE.PAS
          increase the wait for semaphore timeout to 15 seconds
        - Modify COM:TRIGGER_NODE.COM to use the disk boot method everywhere
          (was doing this at MSU only). use [SYS0.SYSEXE]SYSBOOT.EXE 
        - Temporarily modify MOD052_TCS_IO_COMINT_HANDLING.PAS
          not to wait for Assistant COMINT on Pauses
    - build system TRICS_V52.SYS_9DEC93

    - New D0HTCC has ethernet hardware address 08-00-2B-34-EA-E5
      Old D0HTCC had ethernet hardware address 08-00-2B-07-04-85

    - Load a simple system with FAL only
        $ COPY EENV:INIT_DISK.SYS ESYS:TRGD0HTCC.SYS /OVER
        Note we do this by hand because I have modified the ELOAD command.

    - Turn uVAXII off, and 4000M60 on.
        >>> B ESA
        We don't let it boot from the disk, otherwise it would wake up as
        MSUD03, address 46.193
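
        For reference, an area.node address such as 46.193 maps to a single
        16-bit number and to a derived Ethernet address. The Python sketch
        below assumes standard DECnet Phase IV addressing applies to these
        nodes. The function names are illustrative.

```python
def decnet_node_number(area, node):
    """Combine a DECnet Phase IV area.node pair into its 16-bit address."""
    if not (1 <= area <= 63 and 1 <= node <= 1023):
        raise ValueError("area must be 1..63 and node 1..1023")
    return area * 1024 + node


def decnet_mac(area, node):
    """Ethernet address a Phase IV node uses: AA-00-04-00 + low byte, high byte."""
    number = decnet_node_number(area, node)
    return "AA-00-04-00-%02X-%02X" % (number & 0xFF, number >> 8)


msud03_number = decnet_node_number(46, 193)
msud03_mac = decnet_mac(46, 193)
```

        Under that assumption 46.193 is node number 47297 and would appear
        on the wire as AA-00-04-00-C1-B8 instead of the hardware address.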

    - copy the environment files
       $ DELETE D0HTCC::[trigger]*.*.*
       $ COPY TRGCUR:*.* D0HTCC::[trigger]

    - copy the new system over
       $ CD EWORK1:
       $ ELOADHTCC TRICS_V52.SYS_9DEC93 y "New 4000 M60, TRICS V5.2, disk boot"

        I have switched the behavior of the ELOAD command, it now copies
        the system file to target::[SYS0.SYSEXE]SYSBOOT.EXE. 

        Note that the system files are now different. They don't have the same
        (any?) header like the ones for downloading.

    - Reboot. Using ncp trigger node is ok because the disk boot is the
       default method.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  9-DEC-1993 Philippe: 

    - uVAXII system locks up.
      use Edebug
        process flush_to_file error in display_console "file already active"
        message that was going to be displayed "mailbox full but not signaled"
        mailbox content was
            wai/cns%flush% console locked for 5s- recover   10:12:56
            wai/cns% console locked for 5s                  10:14:15
            wai/cns% console locked for 5s                  10:20:19
            file already active                             10:20:19
            skipping                                        10:20:19
        process dispatch was waiting for new message
        process refresh mpool was waiting in display console 
            with some frog blinking data
        channel #1 was waiting for display_console 
            with a pause message to process
        channel #2 was gone!
        channel #3 and #4 were not stoppable (i.e. waiting for elncon message)
        watch_double_buffer was waiting for its interrupt on port C
        begin/end_run was waiting for its cue.

    - Rebooting fails, initialize time is waiting inside the first WRITELN
    - Dan does a CTRL_Q, and the first message goes through, but things are still
    screwed up, and initialize_time is still waiting.
    - Reboot, and now everything ok.

    --> was it simply that someone (Dan) pushed the hold screen key?
    The message time was old, and it might have stayed un-noticed for a while

    --> Does it explain the weird symptoms from 23-NOV-1993?

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  3-DEC-1993 Philippe: 

    - Dan Loads the new system.
      There is also a funny D0HTCC::[TRIGGER]LOOKUP_SYSTEM_MANAGER.ZEB
        with eta 5:8 having old HD PROM's   
        and Dan edited  Boot-Auxi to tell it eta 1:8 should be used.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  2-DEC-1993 Philippe: 

    - fix bug to readjust the Large Tile upper and lower range boundaries in
      the code for the message modifying the trigger tower range on the fly. 

    - build new system EWORK1:TRICS_V50.SYS_2DEC93

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 23-NOV-1993 Philippe: 

    - TCC crashed. Trace back shows MPOOL_SERVER with an error during screen IO 
    %PAS-F-FILALRACT, file already active
    >>494:   WRITELN ( message ) ;

    - Jan has some more information about additional (?) problems:

    After the crash we tried triggering TCC, but it didn't come back up.  Then
    we tried rebooting it by pressing the halt button.  That didn't work
    either. Then we tried power cycling it.  No luck.  We saw that it was stuck
    in self test #9.  John found in a book, that 9 means it's having problems
    talking to its console.  John then powered down TCC, then power cycled
    the terminal, then powered TCC back up.  Then it was ok.  We haven't
    had any problems since.
  
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 22-NOV-1993 Philippe: 

    - There have been 2 instances of TCC hanging. One reported by K.Johns at 
      12-NOV-1993 23:53:37.00, one reported by Jan at 19-NOV-1993 21:10:13.00.

    - Edebug shows a small number of Pool blocks.

    - build new system EWORK1:TRICS_V50.SYS_22NOV93;1 with 10,000 Pool blocks,
      and 1024 ports.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  5-NOV-1993 Philippe: 

    - Dan loads TCC with EWORK1:TRICS_V50.SYS_3NOV93;1

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  3-NOV-1993 Philippe: 

    - Copy TRICS_V50 code and build new system EWORK1:TRICS_V50.SYS_3NOV93;1
        Large Tiles, 
        ITC fix, 
        MPt FMLN Programming    

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  3-NOV-1993 Philippe: 

    - clean D0HTCC::[TRIGGER] directory.
        - delete ALL logfiles from October 
        - the system was rebooted on 
            TRICS_20OCT93.LOG;1    5738/5740    20-OCT-1993 18:01
            TRICS_26OCT93.LOG;2    4386/4390    26-OCT-1993 14:30
            TRICS_26OCT93.LOG;1     156/160     26-OCT-1993 14:18
            TRICS_29OCT93.LOG;2    1303/1305    29-OCT-1993 17:30
            TRICS_29OCT93.LOG;1     451/455     29-OCT-1993 16:57
        - [TRIGGER] now uses 6,000 blocks
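
    The listing above can be totalled with a short Python sketch. It assumes
    the two numbers in each line are blocks used / blocks allocated, as in
    a VMS DIRECTORY listing with both sizes shown. The function name is
    illustrative.

```python
import re

DIR_RE = re.compile(
    r"^\s*(?P<name>\S+);(?P<ver>\d+)\s+(?P<used>\d+)/(?P<alloc>\d+)"
    r"\s+(?P<stamp>\d{1,2}-[A-Z]{3}-\d{4} \d{1,2}:\d{2})\s*$")


def parse_listing(text):
    """Return (name, version, used, allocated, timestamp) for each file line."""
    rows = []
    for line in text.splitlines():
        m = DIR_RE.match(line)
        if m:
            rows.append((m.group("name"), int(m.group("ver")),
                         int(m.group("used")), int(m.group("alloc")),
                         m.group("stamp")))
    return rows


LISTING = """
    TRICS_20OCT93.LOG;1    5738/5740    20-OCT-1993 18:01
    TRICS_26OCT93.LOG;2    4386/4390    26-OCT-1993 14:30
    TRICS_26OCT93.LOG;1     156/160     26-OCT-1993 14:18
    TRICS_29OCT93.LOG;2    1303/1305    29-OCT-1993 17:30
    TRICS_29OCT93.LOG;1     451/455     29-OCT-1993 16:57
"""

rows = parse_listing(LISTING)
used_blocks = sum(row[2] for row in rows)
allocated_blocks = sum(row[3] for row in rows)
```

    For the five files listed, this gives 12034 blocks used and 12050
    blocks allocated.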

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 ......-1993 Philippe: 

    - clean D0HTCC::[TRIGGER] directory.
        - delete ALL logfiles from August
        - [TRIGGER] now uses 18,000 blocks

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 24-AUG-1993 Philippe: 

    - clean D0HTCC::[TRIGGER] directory.
        - delete ALL logfiles from June and July 
        - [TRIGGER] now uses 12,000 blocks

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  9-JUN-1993 Philippe: 

    - Jan G. turns Calorimeter Trigger OFF for rest of shutdown.
      TRICS_BOOT_AUXI.DAT updated to reduce coverage to minimum of 1 tower
      TT(eta,phi)= (+1,1) and limit the number of errors reported by TRICS.
      The same command was also sent to TRICS by hand.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  5-JUN-1993 Philippe: 

    - Power outage (snake in feeder line). Errors when TCC restarted while
      trigger was still off. Booted again after power restored. Everything ok.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  2-JUN-1993 Philippe: 

    - clean D0HTCC::[TRIGGER] directory.
        - copy TRICS*MAY*.LOG to MSUD01::DUA1:[BACKUP.LOGFILES_D0HALL]
        - also copy MPOOL*MAY*.LOG, LOG*MAY*.LOG, MAIL*MAY*.LOG
        - delete ALL logfiles from May
        - [TRIGGER] now uses  2,600 blocks

    - This is also an attempt to help the disk problems by using a different
      area of the disk, maybe on the outer perimeter of the disk, where the
      flux is greater. 

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 27-MAY-1993 Philippe: 

    - TCC was rebooted. They had a problem doing end run, and had no andor rate
      after initialize, also leds were off.

    - Looking in the logfile shows the last entry at 17:50 while the machine
      was only rebooted at 10:45.  This is suggesting we had a recurrence of
      the disk error, as on 21-may. This explanation is also consistent with
      problems during begin/end run and consistent with initialization problems
      (no LEDs, no andor rate, probably no sptrg #31) as both of these actions
      need to read a file of commands from the disk. 

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 25-MAY-1993 Philippe: 

    - Stu found bug in ITC that was consuming system resources (missing Delete
      IO port on failure)
    - rebuild the TRICS_V40.SYS_NO_DISK system using new ITC.
    - build TRICS_V40.SYS_25MAY93 but do not load, 
      the system file in ESYS has >NOT< been overwritten
        
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 23-MAY-1993 Philippe: 

    - read logfile, no problem doing it, nothing learned. Last entry in TRICS
      logfile was at 7:49 21-MAY-1993, Jan's message was from 11am.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 21-MAY-1993 Philippe: 

    - TCC DUDRIVER job exited in the middle of physics running, TCC still
      running but cannot open TRICS_FORCE_UPDATE.DAT and thus cannot make end of
      run files. Is it a disk bad block? 

    - reboot, and run fine. I will >not< try to read the logfiles until
      tomorrow's study period.

    - build a new system TRICS_V40.SYS_NO_DISK that doesn't mount the disk,
      ready to load if TCC fails again.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 20-MAY-1993 Philippe: 

    I believe there is a different problem with the current version of ITC,
showing up at least in TCC, where TRGMON connects and disconnects ITC
channels quite often.

    When an ITC channel is connected to TCC from TRGMON, 7 blocks of the
"System Pool" resource are allocated (seen using Edebug) but when the
channel is disconnected, only 6 blocks are released, leading to a net loss
of one block of system pool.

    This did not use to happen with the private version I had been using
until 8-APR-93. That is 7 blocks were allocated and 7 blocks were released.
The difference in the ITC I loaded is in the fix from MAR-92 "to reset
CCB[Channel].In_Use  on connect failures". I haven't tried yet to guess
where the bug is in ELN_CONNECT.EPAS. One block of this System Pool is used
for each Kernel object created. It is probably a PORT, MESSAGE or something
else that is created, and not deleted before it is created again.
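
    The effect of losing one block per connect/disconnect cycle can be
    estimated with a small Python sketch. The 7 and 6 block figures are the
    ones seen with Edebug above, and the pool sizes tried are the 1024 and
    10,000 values used for this system. Treating the whole pool as free,
    with nothing else consuming it, is a simplifying assumption, and the
    function name is hypothetical.

```python
def cycles_until_exhausted(free_blocks, allocated=7, released=6):
    """Count connect/disconnect cycles completed before a connect fails."""
    if released >= allocated:
        raise ValueError("no net loss per cycle, the pool never runs out")
    cycles = 0
    while free_blocks >= allocated:      # a connect needs 'allocated' blocks
        free_blocks -= allocated         # connect
        free_blocks += released          # disconnect gives back fewer blocks
        cycles += 1
    return cycles


small_pool = cycles_until_exhausted(1024)
large_pool = cycles_until_exhausted(10000)
```

    Under these assumptions a 1024 block pool survives 1018 cycles and a
    10,000 block pool 9994 cycles, so a larger pool only postpones the
    failure.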

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 11-12-MAY-1993 Philippe: 
 
    - build system with new ITC (last attempt failed) which also has 
        logfile messages in ignore_problem routine
        old fix for preventing CPU hogging when reaching max number of channels
      system TRICS_V40.SYS_11MAY93 is loaded by Jan G. Wed. morning.

    - plot REPEAT_EDEBUG for previous system

    - restart REPEAT_EDEBUG
    
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 10-MAY-1993 Philippe: 

    - REPEAT_EDEBUG now shows pool size below 9000
      read MPOOL_SERVER_07MAY93.LOG, see truncate messages still there.
      Use Edebug to halt the ITC connect process and view the source file, it
      is NOT the new code. The cause was probably MPOOL_SERVER.EXE not
      having been relinked on 4-MAY while building new system (ITC OLB not
      listed in MMS).

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  7-MAY-1993 00:10 
    
    - Dan reboots TCC to read new BOOT_AUXI with active veto for begin/end run.

    - restart REPEAT_EDEBUG in the morning, previous recording (since 5-may)
      does not show a visible drop.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  5-MAY-1993 Philippe: 

    - build new system 4-MAY late evening with 
            Ports parameter increased 256 -> 1024
            New ITC from Stu Beta release which has
                fix to the recover connection update that deletes the port
                clear the truncated message flag that isn't used anywhere

    - system loaded by Jan G., today is accelerator study day

    - notice that the global caltrig energy is large and negative. 
      TRGMON's ADC dump shows that every tower is 0, 1 or 2.
      This is traced back to a failure loading the DAC_BYTE file. TRICS stopped
      after noticing that INIT_DAC_BYTES.LSM had values for eta > 16 (that are
      now turned off).

    - clean D0HTCC::[TRIGGER] directory.
        - copy TRICS*APR*.LOG to MSUD01::DUA1:[BACKUP.LOGFILES_D0HALL]
        - also copy MPOOL*APR*.LOG, LOG*APR*.LOG, MAIL*APR*.LOG
        - delete ALL logfiles from April
        - [TRIGGER] now uses  9,700 blocks

    - use REPEAT_EDEBUG from D0MSU2 (MSU2:[TMP12.LAURENS.EDEBUG]) to monitor
      TCC in general and system pool in particular

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 26-APR-1993 Philippe: 

    - water leak in the cal trig. Turn off the last 2 racks.
      change TRICS_BOOT_AUXI to have a tower range of 1:16 

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 23-APR-1993 Philippe: 
      
    - EDISPLAY shows that the pool size has dropped from 500 to 200 

    - build a new system with pool increased from 1024 to 10,000

    - There is beam in the machine and the pool size is dropping to 70.
        call Jan to find out when it is possible to reboot. A run is about to
        end, but the beam is lost right before the reboot anyway.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 21-APR-1993 Philippe: 

    - start watching D0HTCC with 
        SETHOST/LOG on D0MSU2 running EDISPLAY with 60 s refresh rate
        SETHOST/LOG on D0MSU2 running EDEBUG every 10 min

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 20-APR-1993 Philippe: 

    Accelerator is rebuilding the Pbar stack and I get an opportunity to
investigate ITC truncate messages.

    - load MPOOL_SERVER.EXE from previous system (before ITC fix)
        no truncate message
    - add a report about the channel and the request in the error message,
      recompile ITC, relink and reload MPOOL_SERVER.EXE
        truncate messages are back
        not correlated to a particular channel or particular request type
    - find an old ITC library and relink
        truncate messages are still there
    - Stu Fuess thinks the truncate flag is never set and thus picks up random
      previous memory content
    - increase system parameters and reboot
        P1                      1024 -> 2048
        System Interrupt stack     2 ->  128
        System region size      1024 -> 2048

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 19-APR-1993 Philippe: 

    1) TRGMON timeout and ITC truncate messages.

    These ITC truncate messages have started appearing with the last TCC
system change on 8-APR-93. It is probably not a coincidence. They come
somewhat randomly. Unfortunately I didn't have the TCC monitoring pool
server also advertize which channel they come from, and I don't know what
to correlate them with. From what I have gathered so far, these errors are
associated with incoming messages. This is a flag reported by, but not
generated by the ELN ITC code. I am not quite sure if the flag is set by
the host ITC code or by the system services sending and receiving the
messages.

    2) multiple reboots on 17-APR-93

    I believe there was a problem and TCC needed to be rebooted. I believe
there was some confusion while trying to reboot and restart. I will not go
into details, but there is (again) evidence that TCC was still booting while
COOR was told to talk to it.

    I don't know if this string of 3 problems in a week is a fluctuation of
the previously "once per month" rate, or if it is linked to the latest
system change on 8-APR-93.


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 18-APR-1993 Philippe: 

        Jan reports TCC being rebooted on the owl shift (Sunday morning). 
There were other problems at the time with the host, ethernet, ... All HSB
windows disappeared at about that time.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 17-APR-93 Philippe: 

        Jan reports notes from the DAQ log book 

 19:40  TRGMON can't connect to D0HTCC
        insufficient resources at remote node
 19:45  End run 63792.  Fail to read luminosity scalers.

        Seems like a good time to re-init a few things.
        stop/start COOR and Data Logger
        Trigger D0HTCC

    {We were having problems earlier on the day shift and I had asked
    them to stop/start COOR and data logger during the next shot setup.  
    The COOR/data logger connection was not right.}

        Boot unreachable L2 nodes ...

        COOR dies in download.  Still a problem with D0HTCC.
        Reboot D0HTCC (push the button).  Initialize trigger framework.
        Still cannot connect to D0HTCC.
        EDEBUG D0HTCC - looks OK.  Trig_init still fails   {Initialize 
        framework with COOR}
        Power cycle D0HTCC.  Try TRIG_INIT several times.

        now downloading OK.

 20:40  Start run 63800

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 14-APR-1993 Philippe: 

    - get a call from the DAQEXP, Gene Alvarez.
      D0HTCC "crashed" during machine studies so that I have time to investigate

    %PAS-F-ERRDURPUT, error during PUT
    -KERNEL-F-DISCONNECT, circuit disconnected by partner
    Job 5, process 1705, program DUDRIVER has exited.
    Job 5, process 1704, program DUDRIVER has exited.

    - Answer to Gene, Jan G. :
    Some jobs and sub-processes had disappeared. The main problem that
Edebug showed was the "-KERNEL-F-DISCONNECT, circuit disconnected by partner"
exception inside a call to WRITELN. This makes little sense to me. The only
"circuit" that I can think of would be the connection that has to be made
inside the EPASCAL IO routines to the CONSOLE (VT300) driver. 

    At this time, I believe that the problem did not originate in Level 1
software but in the ELN Kernel. It could be a software or hardware problem
that screwed up (all?) the Kernel's datagram tables. (Not just for the Level 1
software, e.g. trying to SET HOST to D0HTCC also generates similar
exceptions in the ECL process).

    - also later notice that D0HTCC is very low on "Pool" pages. Is this a
      cause or a consequence?

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 13-APR-1993 Philippe: 

    - Found in the electronic captain logbook that TCC was booted over the
      weekend, ask Jan for what is in the DAQEXP logbook.

 April 10
 23:30  TRGMON has disconnected a number of times with 
        Error detected during > ITC disconnect
        %ITC-E-NO_CHANNEL, channel request has not been activated
        TRGMON cannot get data from TCC, 
        edebug D0HTCC
        raised exception in TRICS_V40
        489:  WRITELN(message)
        Trigger D0HTCC
 April 11
 0:00   lose all D0HSB windows
        COOR dies

 0:15   COOR dies - unable TO TALK TO TCC
        EDEBUG TCC - looks ok
        Triggered TCC
        Restart COOR

    - answer to Jan:

    The ethernet problems might explain some of the confusion and needing
to boot TCC a second time, but I am not sure that it explains everything.
It seems that one of the processes was halted in Edebug. There is no 
telling now what really happened. I hope that I will be around to investigate
if this happens again.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 10-11-APR-1993 Philippe: 

    - TCC was booted twice, at 10-APR-93 23:57 and 11-APR-93 00:30. 
      With no message from COOR in TRICS_10APR93.LOG. 
      TRICS_08APR93.LOG last message is from 23:17
      MPOOL_SERVER_08APR93.LOG has the following messages at 23:21
        E-EXC/MBX% Message Mailbox is Full but Not Signaled
        S-EXC/MBX% Flush_to_File now Servicing Exception Mailbox
        X-WAI/CNS% Console Locked for 5s, Recover: Force Unlock
        S-EXC/MBX% Exception Mailbox now empty
 

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  8-APR-1993 Philippe: 

    - TCC booted to load EWORK1:TRICS_V40.SYS_7APR93.

    - MRBS_LOSS was moved from andor term #121 to 124

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 7-APR-1993 Philippe: DZERO::EWORK1:
    
    - V3.1 has been moved to ETRICS

    - version change from V3.1 to V4.0; system is EWORK1:TRICS_V40.SYS_7APR93

    - upgrade the remote console to be able to receive the messages from the
      other jobs (MAIL_SERVER, MPOOL_SERVER, LOG_SERVER). 

        - The remote console is still a sub-process of TRICS and created/deleted
          by TRICS, but relays messages from >all< the jobs. 

        - The remote console now also survives a temporary bottleneck while
          sending the messages to the host's terminal. If a message is not
          serviced by the remote console within 1 second, further message
          copying is suspended for 1 minute, then resumed with an error message
          notification. 
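
    The suspend/resume rule above can be sketched as follows. This is a
    Python illustration, not the EPASCAL code in TRICS. The class RelayGate
    and its method names are hypothetical, and so is the wording of the
    notification; only the 1 second and 1 minute figures come from this entry.
    The sketch also assumes that the message which hit the bottleneck is
    itself counted as not copied.

```python
SERVICE_TIMEOUT_S = 1.0    # from the log: a message must be serviced within 1 second
SUSPEND_PERIOD_S = 60.0    # from the log: copying is then suspended for 1 minute


class RelayGate:
    """Decides whether a message is copied to the remote console (hypothetical)."""

    def __init__(self):
        self.suspended_until = None   # time at which copying may resume
        self.dropped = 0              # messages not copied during the suspension

    def offer(self, now, service_time):
        """Return the lines relayed for one message.

        now          -- time the message is offered, in seconds
        service_time -- how long the remote console took to service it
        """
        out = []
        if self.suspended_until is not None:
            if now < self.suspended_until:
                self.dropped += 1
                return out            # still suspended: skip the copy
            # suspension is over: resume with an error notification
            out.append("E- remote console resumed, %d message(s) not copied"
                       % self.dropped)
            self.suspended_until = None
            self.dropped = 0
        if service_time > SERVICE_TIMEOUT_S:
            # bottleneck: stop copying for a while instead of blocking the sender
            self.suspended_until = now + SUSPEND_PERIOD_S
            self.dropped += 1
            return out
        out.append("copied")
        return out
```

    The point of the design is that the sending jobs never wait on the host's
    terminal for more than the timeout; the price is a gap in the relayed
    messages, which the notification makes visible.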

    - wait 5mn before starting MONIT POOL SERVER 
      (keep off Nina's monitoring and trgmon... while booting)

    - fix LOG_SERVER to gracefully survive a link-to-host problem 
      (e.g. exceeded quota) 

    - remove confusion between tree OFFSET and tree CORRECTION

    - implement all the tree browsing messages.

    - modify the method of servicing exceptions in exception handlers, 

        - all messages generated by an exception handler are now put in a 
          mailbox. An "exception tracing" state is set at the beginning and
          cleared at the end of each exception handler. Any message generated,
          even indirectly by routines called by the exception handler, is put on
          the mailbox stack. This method also makes the exception handler
          execute without interruption (no screen IO to wait for). The mailbox
          is then "signaled" to show that it needs emptying at the end of the
          exception handler. 
          
        - the mailbox only holds a maximum of 10 messages, which is plenty for
          all known cases. The mailbox counts the number of messages that
          overflow, and the time of the last entry.

        - Previously, the job of the process "flush to logfile" was to wait for
          a fixed time interval of 2mn30 to wake up and close the logfile.   
          It is now given the additional responsibility of also waking up when
          an exception handler signals the exception message mailbox as
          needing servicing. 
          The process "flush to logfile" will empty the mailbox to the screen
          and logfile (note that the messages keep their original time stamp). 
          These messages are prefixed by X- 
          The process flush to logfile can also unlock the console when it
          finds it remains locked for more than 5s. 
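
    A minimal sketch of the exception mailbox scheme described above, in
    Python rather than EPASCAL. The class ExceptionMailbox, its method names
    and the exact line format are hypothetical; the capacity of 10, the
    overflow count, the time of the last entry, the "signaled" state and the
    X- prefix are taken from this entry. Reporting the overflow count as an
    extra line on flush is an assumption.

```python
MAILBOX_CAPACITY = 10   # from the log: at most 10 messages are held


class ExceptionMailbox:
    """Bounded message store filled by exception handlers (hypothetical sketch)."""

    def __init__(self):
        self.messages = []        # (timestamp, text) pairs, oldest first
        self.overflow = 0         # messages discarded because the mailbox was full
        self.last_entry = None    # time of the last attempted entry
        self.tracing = False      # "exception tracing" state
        self.signaled = False     # set when the mailbox needs emptying

    def begin_handler(self):
        self.tracing = True

    def end_handler(self):
        self.tracing = False
        self.signaled = True      # would wake up the "flush to logfile" process

    def put(self, timestamp, text):
        """Queue a message without doing any screen IO."""
        self.last_entry = timestamp
        if len(self.messages) >= MAILBOX_CAPACITY:
            self.overflow += 1
            return False
        self.messages.append((timestamp, text))
        return True

    def flush(self):
        """Empty the mailbox; each line keeps its original time stamp."""
        lines = ["X-%s %s" % (ts, text) for ts, text in self.messages]
        if self.overflow:
            # assumption: the overflow count is reported as one extra line
            lines.append("X-%s %d message(s) lost to overflow"
                         % (self.last_entry, self.overflow))
        self.messages = []
        self.overflow = 0
        self.signaled = False
        return lines
```

    Because put() never touches the screen, the exception handler runs to
    completion without waiting; all slow IO happens later in flush().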

    - Other jobs (MAIL_SERVER, MPOOL_SERVER, LOG_SERVER) also receive the
      same treatment for their exception handlers and now have their own "flush
      to logfile" subprocess 

    - MPOOL_SERVER and ITC 
        Increase the maximum number of connections to MPOOL_SERVER from 10 to 15
        Also use a more recent version of ITC that has a fix to recover channel
        resources in case of connection failure.
        Add a fix to prevent indefinite loop when the maximum number of channels
        are connected.

    - make the "mailing to" message a system class message to have it in the
      logfile.

    - hardware updates
        Update the read/write mask of L1.5 Control MTG Ch 29 & 30 (long timeout)
        Quit initializing L1.5 control MTG channel 29
        Initialize L1.5 receiving MTG channels 1:19
        Fix mtgbusy, comint busy stretch PAL is at FA 1, not 31.
        
    - watch double-buffer
        Move screen message notifying of resynchronization to AFTER doing it
        Raise process priority (8 -> 7) for un-interrupted service

    - refresh monit pool reset comint
        Resetting COMINT to restore data flow is now done through the file
                TRICS_RESET_DIRECTIVES.DAT.  

    - INITIALIZE_TIME
        Remove the wait for 15 seconds before starting executing
        Clear the screen when starting, for better visibility
        In case of problem, wait 10 seconds and retry
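
    The retry behaviour can be sketched as below (Python, not the actual
    INITIALIZE_TIME source). The callable ask_host and the max_tries bound
    are hypothetical; this entry only states the 10 second wait and does not
    say whether the number of retries is limited.

```python
import time

RETRY_DELAY_S = 10   # from the log: wait 10 seconds and retry


def initialize_time(ask_host, max_tries=5, sleep=time.sleep):
    """Ask the host for the time, retrying after a fixed delay on failure.

    ask_host  -- hypothetical callable returning the time or raising OSError
    max_tries -- assumed bound on the number of attempts
    sleep     -- injectable so the loop can be exercised without waiting
    """
    for attempt in range(1, max_tries + 1):
        try:
            return ask_host()
        except OSError as err:
            print("attempt %d failed: %s" % (attempt, err))
            if attempt < max_tries:
                sleep(RETRY_DELAY_S)
    raise RuntimeError("host did not answer after %d tries" % max_tries)
```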
    

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 5-APR-1993 Philippe: 
    
    - move to Daylight savings time by the following simple method
        Edebug> Create Job Initialize_Time

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 3-4-APR-1993 Philippe: 
    
    - clean D0HTCC::[TRIGGER] directory.
        - copy TRICS*MAR*.LOG to MSUD01::DUA1:[BACKUP.LOGFILES_D0HALL]
        - also copy MPOOL*MAR*.LOG, LOG*MAR*.LOG, MAIL*MAR*.LOG
        - delete ALL logfiles from March
        - [TRIGGER] now uses 11,000 blocks

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2-APR-1993 about 13:30 Philippe: 
    
    - TCC locks up at the beginning of a store, but the run is already started.

    When Dan called me, I used Edebug to reach the node, and browse around.
I couldn't make any of the symptoms fit in a coherent manner. 
    - None of the ITC, or ELNCON connections could get to it.
    - None of the Level 1 processes appeared to be doing anything.
    - None of the processes ran out of memory, or other resources
    - I couldn't halt any but one (begin_end_run) of the L1 processes. 
    - even FAL wouldn't work, which is the first time I saw it. 
    - Only Edebug could talk to it.
    - I couldn't SET HOST either, meaning also no ELN monitoring. 

    I was starting to think it was an ELN kernel problem that had all the
processes stuck, or a hardware error in the primitive uVAX II CPU board
(e.g. it only has one-bit-per-byte parity checking in memory). A new
store was in and I gave up after 10 minutes and told Dan to reboot TCC.

    I had captured my session with Edebug in a logfile and picked
it apart, comparing memory usage and delta CPU time. It took me almost an
hour to reach the following conclusion (which fits all the symptoms), and TCC
was rebooted by then. I didn't get to use edebug and see what ITC was doing. 

    My current understanding of what happened is that one of the two
subprocesses that ITC created in the ELN server job for TRGMON became 100 %
busy doing something. I don't really understand ITC, but I believe this process
handles all incoming new connections. It is at a higher priority than regular
user jobs. The ELN system is not time-sharing and this job took every CPU cycle
it could find. 
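
    To illustrate the reasoning above: under strict priority scheduling with
    no time slicing, a single always-ready process at a higher priority
    receives every CPU tick. The following is a toy Python model, not ELN
    code. It assumes that a lower number means a higher priority (consistent
    with the 8 -> 7 "raise" noted on 7-APR-1993); the process names and
    priority values are made up.

```python
def run_scheduler(processes, ticks):
    """Strict-priority scheduling without time slicing.

    processes -- list of (name, priority, always_ready) tuples; a lower
                 priority number means a higher priority
    ticks     -- number of CPU ticks to simulate
    Returns a dict of CPU ticks received per process name.
    """
    usage = {name: 0 for name, _, _ in processes}
    for _ in range(ticks):
        ready = [p for p in processes if p[2]]
        if not ready:
            continue                      # idle tick
        # the highest-priority ready process always wins the CPU
        name, _, _ = min(ready, key=lambda p: p[1])
        usage[name] += 1
    return usage


procs = [
    ("itc_subprocess", 8, True),   # hypothetical: stuck in a busy loop
    ("level1_a", 16, True),        # hypothetical Level 1 processes
    ("level1_b", 16, True),
]
print(run_scheduler(procs, 1000))
```

    In this model the two lower-priority processes never run, which matches
    the symptom that none of the Level 1 processes appeared to be doing
    anything while no resource was exhausted.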

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
17-MAR-1993 Philippe: 
    
    - clean D0HTCC::[TRIGGER] directory.
        - copy TRICS*FEB*.LOG to MSUD01::DUA1:[BACKUP.LOGFILES_D0HALL]
        - also copy MPOOL*FEB*.LOG, LOG*FEB*.LOG, MAIL*FEB*.LOG
        - delete ALL logfiles from February
        - [TRIGGER] now uses 17,000 blocks

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
29-MAR-1993 Philippe: 

    - there was this entry (3am?) in the daqexp logbook (reported by Jan G.)

        Fatal alarm -  TCC_Link_Err.COOR
                       Lose COMM_TKR's?
                       RESET_COOR
        Test trigger complains - Failure on effort to connect to D0HTCC 
        COOR dies
        restart COOR
        COOR.OUT file full of NETCON error %SYSTEM-F-REMRSCRC, insufficient
        system resources at remote node
        and
        DISCON error: %SYSTEM-F-IVCHAN, invalid lack of connection to D0HTCC
        TRIG_INIT fails in TALK
        TRGMON error detected during ITC connect to D0HTCC
               Insufficient system resources at remote node.
               Error:  Cannot get data from control computer.
               %ITC-E-NO_CHANNEL Channel requested has not been activated

        EDEBUG D0HTCC - Loading traceback from MAIL_SERVER.EXE
                 crashed in MSUTRGOUT:[TRG_TARGET.SOURCE_1WORK]MAIL_SERVER.EXE
                 LINE 458F:  ELN$UNLOCK_AREA(console_obj, console_synch^.lock)

        MCR NCP TR NO D0HTCC
        EDEBUG D0HTCC
                only jobs up are
                            XQDRIVER
                            CONSOLE
                            EDEBUGREM
                            DUDRIVER
                            INITIALIZE_TIME  priority 16-waiting
                no other jobs running
                INITIALIZE_TIME => D0HTCC is still booting
        L1 68k display is going crazy!!
                The slave ready did not drop 
                Event with no Spec Trigs Fired, not transfered.
        But no runs are in progress.
        Reload 68k and immediately (at Go 95000) get the same message as above.

        Reboot D0SUPR, D0SEQR
        Retrigger D0HTCC

        TALK  INIT_TRIG
        Fatal error message goes away.
        Edebug looks ok.
        Exit.  Restart.  Everything's OK.
        Test trigger running smoothly.

    - here is part of my answer to Jan

    1) I have no doubt that there was a problem, and it was right to boot.

    2) Either they didn't wait long enough for TCC to boot, or the
       host failed to answer to the INITIALIZE_TIME task. 
       The trigger control software started two logfiles, at 3:14 and 3:16. 
       If the logbook is correct that they only had to trigger TCC 
       twice, then it looks like they didn't wait long enough. 
       There were also 2 initialize messages from COOR only 90 s apart at 3:20.

    3) I wouldn't worry much about the VME 68k until TCC is fully booted.

    Here is what I propose for improving the overall situation.

    1) John restoring FDDI and returning to fluid network link to the
       host(s) is going to help any TRGMON, and/or booting problem.

    2) I will improve the INITIALIZE_TIME task to do a better job at
       displaying what phase of the booting process it is in. And make it
       automatically retry in case of failure.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
12-MAR-1993 Philippe: 
    
    - Dan has fixed COMINT to clear most recent data block while it is trying to
      start. The rate of resynchronization messages in the logfile should go
      from every 20 mn to almost never.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
24-FEB-1993 Philippe: 

    - Dan loads TRICS_V31.SYS_11FEB93 into D0HTCC.  
      This is the fix to the Begin/End Run file Synchro problem with COOR 
      and it has 5x the margin for virtual address space. 

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
12-FEB-1993 Philippe: 
    
    - clean D0HTCC::[TRIGGER] directory.
        - copy TRICS*JAN*.LOG to MSUD01::DUA1:[BACKUP.LOGFILES_D0HALL]
        - delete ALL logfiles from January 
        - [TRIGGER] now uses 33,000 blocks

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 8-JAN-1993 Philippe: 
    
    - clean D0HTCC::[TRIGGER] directory.
        - copy TRICS*DEC*.LOG to MSUD01::DUA1:[BACKUP.LOGFILES_D0HALL]
        - delete all logfiles from December
        - [TRIGGER] has now 8,000 blocks

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 7-JAN-1993  Philippe: 

    - new system, installed, EWORK1:TRICS_V31.SYS_7JAN93
        Add 2 messages in begin/end run: "file opened" and "done".
        Increase the priority of the begin/end run task by one unit.
        All this is on top of the modifications from 19-dec-92 
 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 19-DEC-1992 Philippe: 

    - New system was built, but never installed EWORK1:TRICS_V31.SYS_19DEC92
        Modify update_register (more messages, data not write masked)
        Modify definition and initialization of FW TSS write A/B PAL
        Send messages at error during register initialization
        Add prescaler ratio to begin/end run file
        Initialize only a fraction of the l1.5 terms

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 3-DEC-1992 Philippe: 
    
    - clean D0HTCC::[TRIGGER] directory.
        - copy TRICS*NOV*.LOG to MSUD01::DUA1:[BACKUP.LOGFILES_D0HALL]
        - delete all logfiles from November

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 12-NOV-1992 Philippe: 
    
    - clean D0HTCC::[TRIGGER] directory.
        - copy TRICS*OCT*.LOG to MSUD01::DUA1:[BACKUP.LOGFILES_D0HALL]
        - delete all logfiles from October
        - [TRIGGER] has now 17,000 blocks

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 11-NOV-1992 Philippe: 

    - mpool_server 
        - fix mpool_server for Nina's messages
        - upgrade to skip building the same message as sent before
        - change exception handler messages 

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 6-NOV-1992 Philippe: 

    - Add new monitoring message for foreign scalers.

    - update message to Nina's Cross system monitoring with Level 1.5 quantities

    - system loaded 7-nov-92

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
21-OCT-1992 Philippe: 

    - new system with new version number 3.0 -> 3.1
        - new definition for mtgfwtss Ch #  4 as MTG05
                             mtgcttss Ch #  5 as MTG06 (latch/shift)           
                             mtgcttss Ch # 29 as MTG05 (IMLRO latch/shift)     
                             mtgbusy  Ch # 31 as FEBzGS01 (double buffer full) 
        - hardware initialize mtgbusy Ch # 31 to 0 (instead of 10)
        - fix bug in CHTCR test that was truncating error messages for phi 
        - tighten Find_DAC_byte "median" requirement 
        - Find_DAC_byte now has exception handler to close result file and
          resignal. This will now properly close the file when the test task is
          deleted. 
        - change the console messages for the "auxiliary init" to be less
          specific since the concept is used in other instances.
          "Command File Closed" instead of "Auxiliary Init File Closed", ...
        - fix bug in monit pool filler, (ZERO mpool_rec.twr) to solve non
          existing towers appearing excluded in TRGMON
        - delete area mpool_obj before quitting or after delete task, to
          prevent eating up virtual address space
        - fix bug in the result_file service that caused binary garbage in the
          first line of the result file (e.g. DAC_*.LOG)
        - Begin/End Run now forces a latch/shift using the file 
          D0HTCC::[TRIGGER]TRICS_FORCE_BUF_UPDATE.DAT 
        - the new pause run, .... messages are implemented. 

    - I forgot to make FIND_DAC use the proper value to take control
      of latch/shift. Fortunately, one can still start find_dac and (quickly)
      overwrite the register, as long as it is before the first pause.
    
    - make new system. Dan loads it.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
29-SEP-1992 Philippe: 

    - fix FIND_DAC to make a requirement that the median is close to the
      proposed DAC_BYTE. This is to suppress problems where the gaussian tail
      low statistics produces an abnormal high reading.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 9-SEP-1992 Philippe:                                  

    - fix set_trgtwr_simu (used by exclude trigger tower) to still complete
      action in case of failure.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 9-JUL-1992 Philippe: 

    - upgrade to TRICS V3.0

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 3-JUL-1992 Philippe: 

    - failed attempt to upgrade to trics V3.0. Some problems were fixed with
      disk mounting missing from Ebuild. Still a problem with TWB bits always on.

    - return to system from 1-jul-92 

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 1-JUL-1992 Philippe: 

    - bug fix in card address of L1.5 SBSC

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
26-JUN-1992 Philippe: 

    - add monitoring Level 1.5 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
18-JUN-1992 Philippe: 

    - new system: L1.5 bug fixes

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
11-12-JUN-1992 Philippe: 

    - new system 
        - fix bug initializing all triggers as l1.5 capable
        - properly define new L1.5 PALs in FW TSS MTG, Hld Trf. MTG, St Dgt MTG 
        - define new L1 PALs listening to Level 1.5
        - tune Level 1.5 programming

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 4-JUN-1992 Philippe: 

    - new system 
        - ignores COOR's GEO_SECT DGTZ_OFF messages, for solving new level 1.5
        PAL programming problem in start digitize. 
        - Special TRICS initialization of MTG FW TSS ch#14 incr.St.Dgt.Num to
        be ROM Gated.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 3-JUN-1992 Philippe: 

    - prepare new system using the new MENU with updated andor card test, 
      and including correct restoring of reference sets. Not loaded

    - use 1470 byte network segment.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
19-MAY-1992 Philippe: 

    - unexplained communication to TCC hanging shortly after beginning of run.
    - build a system without a disk. 
        - update SITE_DEPENDENT.CST             
        - update EBUILD, not mount the disk
        - arrange to open FOR003 on the host before zebra in MOD095_INIT_LSM.PAS
    - the diskless system didn't help
    - the best guess is that there was a hardware or configuration problem on
    the ethernet link to the host.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
13-MAY-1992 Philippe: 

    - new system 
        global threshold data base
        manage tree offsets
        restore caltrig towers, threshold, jet list progr
        monitor thresholds
    - manually modified by leaving out 200 EM and 200 HD counts of offset in
      Tier #3 in order to keep 400 counts of offset in Tier #4 Tot 
        MOD130_INIT_THRESHOLDS.PAS     
        MOD095_INIT_LSM.PAS            
    
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
24-APR-1992 Philippe: 

    - build new system 
        prepared to look for the jet list argument andor card in mba 106.
        reset the new DBSCs in M101. 
        fix the TRGMON 68k state problem (items swapped)

    - build new system
        fix missing initialization of jet list andor card

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
23-APR-1992 Philippe: 

    - fix ITC max message size and rebuild system

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
21-APR-1992 Philippe: 

    - copy EWORK1: (TRICS_V25) to ETRICS

    - build new system in EWORK1: in preparation for new COMINT PROMs
        copy MSU's new Code (TRICS_V26) to EWORK1:
        modify MOD123_INIT_CBUS_CARDS.PAS to ignore FMLN cards

    - recompile official TRGMON to match new data format.
        update lv1_mpool_raw.inc in TRGMGR::SHTRGMON: directory
        update HTRGMON:TRGMON_DRIVER_LINK.OPT for new common block
        $ @COMPILE_TRGMON
        $ @LINK_TRGMON

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
20-APR-1992 Philippe: 

    - check and delete logfile TRICS_02APR92.LOG

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
17-APR-1992 Philippe: 

    - D0HTCC was rebooted, thus using latest system

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
13-APR-1992 Philippe: 

    - install in TRGUSER account directory [.DIRECT_TO_TCC] 
      RUN_PAUSE_RESUME.COM and PAUSE_RESUME.EXE

    - mail sent to Jan Guida, Joey Thompson

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 9-APR-1992 Philippe: 
    
    - Build new system, placed as next load file, but NO reboot made.
        hardware initialize L1 fired for bunch in FWTSS as gated, 
        fix bug in message parsing (R.Astur, second refset receives BAD PARAM) 

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 2-APR-1992 Philippe: 
 
    - modify LOGIN.COM for TRGUSER so as to establish a permanent NETSERVER for
      use by MAIL, and write begin/end run.

    - This was discovered by tracking the code in SYS$SYSTEM:NETSERVER.COM

$! If this is a network request, tell NETSERVER.COM to create a PERMANENT
$! server (with a maximum of 1 permanent server).
$ IF ( F$MODE() .NES. "NETWORK") THEN EXIT
$ DEFINE NETSERVER$SERVERS_TRGUSER 1
$! AND set a timeout of 24 hours for any additional netserver that will be
$! created non-permanent.
$ DEFINE NETSERVER$TIMEOUT "0 23:59:59"
    
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 1-APR-1992 Philippe: 

    - relink TRGMON with recent ITC, this appears to solve the hanging problem.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
25-MAR-1992 Philippe: 

    - fix mail server. It was releasing the area too early, and thus could
      advertize the job done for the next message.


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
25-MAR-1992 Philippe: 

    - prepare a new system for Bruce's test on WRT_HOST messages
        only acknowledge, no action
        WRT_HOST  BEG_RUN
        WRT_HOST  END_RUN
        WRT_HOST  SYNCHRO

    - move resulting set of trics V2.4 files EWORK1: -> ETRICS:
    
    - copy MSU's files from MSUHEP::EWORK1: to D0::EWORK1: and start building 
      new system TRICS V2.5
    
    - step up and now open file on host. No data yet.
    
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
20-MAR-1992 Philippe: 

    - Install new TRICS V2.4 system with
        - full turn 6 on 6 MTG, 
        - new find_dac, 
        - load_dac, 
        - no error on dbsc reset
        - add BOOT AUXI file, with find&load DAC
        - init trigger number & sptrg strobe fwtss to ROM gated

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 5-MAR-1992 Philippe: 

    - There is a file in TRGCUR:SET_LEVEL1_BEAM_CROSSING_PERIOD.COM       
      There are 2 new files on D0HTCC::[TRIGGER] and TRGCUR:
      TRICS_INIT_AUXI.DAT_4_BUNCH                           
      TRICS_INIT_AUXI.DAT_6_BUNCH                           
      The COM_file copies the *.DAT_n_BUNCH to *.DAT   

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
17-FEB-1992 Philippe, D0::EWORK1:

    - John installed ELN V4.3 last Friday.
    
    - save OLB from ELN V4.2    $ REN *.OLB TRICS_V23_DEB.OLB_ELNV42

    - start EWORK1: $ MMS 
      fails on MPOOL_SERVER.OBJ because ITC still references old $KERNELMSG
        - copy MSUHEP MSUTRGROOT:[TRG_LIB.ITC]ELNOLBS.COM, [.ELN]*.*, [.INC]*.*
        - $ @ ELNOLBS
        - $ REN DEB_ELN_ITC.OLB TRGOLB:

    - restart EWORK1: $ MMS/SKIP 

    - prepare system for next load.

    - send mail to John to rebuild D0$ITC:

    - try TRICS_V23.SYS  file on MSUD04::
      fails on INITIALIZE_TIME when asking boot node 557095 = 544.39!
      -> INITIALIZE_TIME probably needs recompiling.

    - $ DELETE TRICS_V23.SYS;
      $ DELETE INITIALIZE_TIME.EXE;
      $ DELETE UNLOCK_SHA.EXE;
      $ MMS/SKIP

    - retry TRICS_V23.SYS  file on MSUD04::
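
    The "557095 = 544.39" boot node noted above is consistent with the usual
    DECnet packing of an address as area * 1024 + node. A small Python check
    (the function names are hypothetical):

```python
def split_decnet(value):
    """Split an integer into (area, node) using area * 1024 + node."""
    return divmod(value, 1024)


def is_valid_phase4(area, node):
    """DECnet Phase IV allows areas 1..63 and nodes 1..1023."""
    return 1 <= area <= 63 and 1 <= node <= 1023


area, node = split_decnet(557095)
print("%d.%d valid=%s" % (area, node, is_valid_phase4(area, node)))
```

    Area 544 is outside the Phase IV range, so the value was presumably
    garbage rather than a real boot node, which fits the guess that
    INITIALIZE_TIME needed recompiling against the new ELN version.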

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
30&31-JAN-1992 Philippe, D0::EWORK1

    - install new system with st_vs_rs, global_threshold, bug fix in reference
      set programming, error_filter

    - increase time constant to 0.5 s
    
    - saved in EWORK1: $ REN TRICS_V23.SYS TRICS_V23.SYS_31JAN92

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
21&22-JAN-1992 Philippe:

    - check logfiles up to TRICS_10JAN92.LOG

################################################################################
****  NOTE
################################################################################
VAXELN - V4.3    D0HTCC  UP:      2 19:16:49.21   5-APR-1993 10:01:18.08
                (57.560) IDLE:    2 07:29:46.54


PAGES:    11849/11849/26624            S0-REGION:      5143/5143/5280
POOL:     526/1024                     JOB-SLOTS:      8/24
LOADER:   0/0/0                        PROCESS-SLOTS:  25/80
                           R/W   R/O
Job# Program_name Mode Pri   Pages     State       Runtime
 2   XQDRIVER      K   1   131   104   WAIT      0 02:20:12.62
 3   CONSOLE       K   2   23    23    WAIT      0 00:09:37.09
 4   EDEBUGREM     K   3   2     27    WAIT      0 00:00:00.33
 5   DUDRIVER      K   4   89    143   WAIT      0 00:03:38.49
 7   FALSERVER     U   16  14    13    WAIT      0 00:00:53.04
 8   TRICS_V31     K   16  3713  1562  WAIT      0 07:30:49.77
 9   MAIL_SERVER   U   16  23    233   WAIT      0 00:00:00.75
 10  MPOOL_SERVER  U   16  372   258   WAIT      0 01:30:40.71
 11  LOG_SERVER    U   16  25    230   WAIT      0 00:00:00.06
 12  RTDRIVER      U   16  48    39    WAIT      0 00:00:00.69
 14  ECL           U   16  26    242   WAIT      0 00:00:00.25
 15  EDISPLAY      U   16  52    47    RUN       0 00:00:02.45
################################################################################
VAXELN - V4.3    D0HTCC  UP:      1 07:56:19.98   9-APR-1993 11:24:20.01
                (57.560) IDLE:    1 02:21:15.98


PAGES:    11716/11675/26624            S0-REGION:      5141/5141/5280
POOL:     517/1024                     JOB-SLOTS:      8/24
LOADER:   0/0/0                        PROCESS-SLOTS:  23/80
                           R/W   R/O
Job# Program_name Mode Pri   Pages     State       Runtime
 2   XQDRIVER      K   1   127   104   WAIT      0 01:03:59.23
 3   CONSOLE       K   2   20    23    WAIT      0 00:04:54.66
 4   EDEBUGREM     K   3   2     27    WAIT      0 00:00:00.00
 5   DUDRIVER      K   4   48    143   WAIT      0 00:02:41.41
 7   FALSERVER     U   16  6     13    WAIT      0 00:00:00.83
 8   TRICS_V40     K   16  3506  1637  WAIT      0 03:36:14.76
 9   MAIL_SERVER   U   16  33    244   WAIT      0 00:00:00.43
 10  MPOOL_SERVER  U   16  640   269   WAIT      0 00:41:53.04
 11  LOG_SERVER    U   16  35    242   WAIT      0 00:00:14.86
 12  RTDRIVER      U   16  48    39    WAIT      0 00:00:01.16
 13  ECL           U   16  26    242   WAIT      0 00:00:00.22
 14  EDISPLAY      U   16  52    47    RUN       0 00:00:01.64
################################################################################