                                                            19-APR-1993

TCC boot procedure
------------------

This is a short summary of the content of this file:

  - What is the problem? Collect evidence. Call P. Laurens if possible.
  - Use NCP to TRIGGER the node. After a complete crash you need to push
    "restart".
  - Wait about 5 minutes, then check TCC's console and look for
    INITIALIZATION COMPLETE.
  - Make COOR initialize the trigger.

These are notes and details related to booting D0TCC.

What is the problem?
====================

If time allows you to investigate the problem, call Philippe
    FNBIT::MSUHEP::LAURENS  (517)355-8525, Home (517)372-9849

Please record in the logbook what the various symptoms are:

  - What does COOR say of the connection to TCC?

  - What does TRGMON say?

    If even Edebug cannot connect to the node, the ELN system may have got
    into enough trouble that it dropped to Local-Edebug mode, and a remote
    trigger will not be successful. Go look at the TCC console in the MCH
    and look for an Edebug> prompt buried in the rest of the screen. If that
    is the case, there is nothing useful to do but push the "restart" button
    on the front of the MicroVAX.

  - Edebug: (see attached reference for Edebug screenful)
        Is there a process in debug-wait mode?
            yes: Edebug> show call
            also Edebug> examine ...    if you want to look at a variable
        Edebug> show sys        are all the jobs there?
        Edebug> show job 8      are all the processes there?

  - Does TCC's console in the MCH show anything peculiar?
        Are there any error messages?
        Does the screen update (e.g. does the frog blink)?
        Is there a Local-Edebug prompt? (Edebug> prompt)
        Is the CPU halted? (>>> prompt)

Boot TCC
========

TCC is defined in the NCP database of D0HSC. (Remember that triggering a node
requires the OPER privilege.)

On D0HSC, type
    $ MC NCP TRIGGER NODE D0HTCC

From another machine, type
    $ MC NCP TELL D0HSC TRIGGER NODE D0HTCC

Note: Remember that an ELN node can only be remotely triggered if its (old)
      ELN system is still running (and if this option was specified at system
      build time).
      In particular, if the CPU is halted (>>> prompt), or in Local-Edebug
      mode, triggering the node will NOT be successful.

Boot Phase 0: System Tests (only in the case of hard boot)
-------------

During this phase, the system is unreachable to Edebug.

If remote triggering fails, push the "restart" button on the front of the
MicroVAX enclosure.

The TCC console should then say "Performing normal system tests" and start
counting down 7..6..5..4..3..

It should automatically try loading a system. If, for some reason, it stops
with the >>> prompt, type B XQA0 on the console keyboard (>>> B will work
too).

This phase takes less than 2 minutes.

Boot Phase 1: ELN system download
-------------

During this phase, the system is unreachable to Edebug.

The TCC console should then count down 2..1..0.. (note that these messages
might be buried in the rest of the old screen messages).

This phase can take a minute or two depending on network and D0HSC
availability.

Boot Phase 2: ELN system initializing
-------------

During this phase, the system is still unreachable to Edebug.

The TCC console shows (note that these messages might be buried in the rest
of the old messages):

    %VAXELN system initializing VAXELN V4.3 QBUS

It is during this phase that ELN checks and mounts TCC's disk.

This phase takes about one minute.

Boot Phase 3: Initialize Time
-------------

The system is now reachable by Edebug.

You don't need to be using Edebug, but if you were you would see:

    Edebug CTRL/C>sho system
    Available -- Pages: 13330, Page table slots: 14/64, Pool blocks: 835
    Uptime: 0 00:00:53.72
    Time used by past jobs: 0
    Idle Time CPU 0: 0 00:00:52.49
    Job 2, program XQDRIVER, priority 1 is waiting.
    Job 3, program CONSOLE, priority 2 is waiting.
    Job 4, program EDEBUGREM, priority 3 is running.
    Job 5, program DUDRIVER, priority 4 is waiting.
    Job 6, program INITIALIZE_TIME, priority 16 is waiting.

The Initialize_Time job must complete before the rest of the Level 1 software
can start executing.
Initialize_Time connects to the file TELL_TIME.COM in the TRGUSER account.

Initialize_Time clears the TCC console screen and displays:

    INIT_TIME: Requesting current time from boot node 58371

and eventually

    INIT_TIME: The time has been reset to ...

If a problem occurs, an error message is displayed and the operation is
retried every 10 s until it is successful. Please read the message and try to
solve the problem.

Note: If you get the following error message:

          INIT_TIME: %KERNEL-F-NO_SUCH_NAME, no such name

      check the existence of USER1:[TRGUSER]TELL_TIME.COM
      If the file is gone, get a copy from USER1:[TRGMGR.TRG_LIB.MISC_SOURCE]

This phase takes from 1 to 30 seconds depending on whether a new NETSERVER
needs to be started on D0HSC::TRGUSER.

Boot Phase 4: Level 1 Software Initializing
-------------

The Level 1 jobs are now created and start initializing.

The Level 1 Trigger Control Software (TRICS) loads some files from the local
disk, then defines and initializes all the data structures supporting the
control of the trigger. It also performs a full initialization of the Trigger
Framework and Calorimeter Trigger.

During this phase not all the TRICS subprocesses have yet been created.

You don't need to be using Edebug, but if you were you would see:

    Edebug CTRL/C>sho job 8
    Job 8, program TRICS_V40, priority 16 is ready.
        Shared read/write size: 1629184. Read only size: 838144
    Process 1, priority 8, is running.
        Stack size: 256000. CPU time: 0 00:00:14.37
    Process 2, FLUSH->FILE, priority 8, is waiting.
        Stack size: 256000. CPU time: 0 00:00:00.02
    Process 3, DISPATCH, priority 8, is waiting.
        Stack size: 256000. CPU time: 0 00:00:00.03
    Accumulated CPU time for this job: 0 00:00:27.62

This phase takes about 2 minutes. Note that it will take longer if a hardware
problem is detected and needs to be reported.
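
The phase durations quoted in these notes can be added up to see where the
wait of about 5 minutes mentioned in the summary comes from. The Python sketch
below is only an illustration and is not part of the TCC software. The table
PHASES and the function boot_budget are hypothetical names, and the lower
bound of 0 s for the system tests is an assumption; the other figures are the
rough durations given for each phase.

```python
# Rough (low, high) duration of each boot phase, in seconds.
# Figures come from the phase descriptions in these notes, except the
# lower bound of the system tests, which is an assumption (0 s).
PHASES = [
    ("System tests (hard boot only)", 0, 120),     # "less than 2 minutes"
    ("ELN system download", 60, 120),              # "a minute or two"
    ("ELN system initializing", 60, 60),           # "about one minute"
    ("Initialize time", 1, 30),                    # "1 to 30 seconds"
    ("Level 1 software initializing", 120, 120),   # "about 2 minutes"
]


def boot_budget(hard_boot):
    """Return (low, high) total boot time in seconds.

    A remote trigger skips the system tests, so the first entry is only
    counted when hard_boot is true.
    """
    phases = PHASES if hard_boot else PHASES[1:]
    low = sum(p[1] for p in phases)
    high = sum(p[2] for p in phases)
    return low, high


if __name__ == "__main__":
    for label, hard in (("remote trigger", False), ("hard boot", True)):
        low, high = boot_budget(hard)
        print(f"{label}: {low / 60:.1f} to {high / 60:.1f} minutes")
```

With these figures a remotely triggered boot comes to roughly 4 to 5.5
minutes, and a hard boot to at most about 7.5 minutes. A hardware problem
reported during Level 1 initialization would add to this.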
Boot Completed: COOR can now connect to TCC
---------------

The following message is displayed on the TCC console:

    S-INI/DON% INITIALIZATION COMPLETE

The Frog on the TCC console starts blinking to show that the monitoring
information is being refreshed.

All the necessary sub-processes of TRICS have now been created, in particular
the sub-process of TRICS reserved for the connection with COOR:

    Process 6, CON CHANNEL 1, priority 8, is waiting.
        Stack size: 256000. CPU time: 0 00:01:50.40

--------------------------------------------------------------------------------

Normal report from Edebug session to D0TCC

    Edebug CTRL/C> Show System
    Available -- Pages: 11964, Page table slots: 10/34, Pool blocks: 584
    Uptime: 1 10:55:42.69
    Time used by past jobs: 0 00:00:00.11
    Idle Time CPU 0: 1 04:50:45.60
    Job 2, program XQDRIVER, priority 1 is waiting.
    Job 3, program CONSOLE, priority 2 is waiting.
    Job 4, program EDEBUGREM, priority 3 is running.
    Job 5, program DUDRIVER, priority 4 is waiting.
    Job 7, program FALSERVER, priority 16 is waiting.
    Job 8, program TRICS_V40, priority 16 is waiting.
    Job 9, program MAIL_SERVER, priority 16 is waiting.
    Job 10, program MPOOL_SERVER, priority 16 is waiting.
    Job 11, program LOG_SERVER, priority 16 is waiting.
    Job 12, program RTDRIVER, priority 16 is waiting.

    Edebug CTRL/C> Show Job 8
    Job 8, program TRICS_V40, priority 16 is waiting.
        Shared read/write size: 1816576. Read only size: 838144
    Process 1, INITIALIZE, priority 8, is suspended.
        Stack size: 256000. CPU time: 0 00:01:04.50
    Process 2, FLUSH->FILE, priority 8, is waiting.
        Stack size: 256000. CPU time: 0 00:00:01.04
    Process 3, DISPATCH, priority 8, is waiting.
        Stack size: 256000. CPU time: 0 00:13:11.22
    Process 5, RFRSH M_POOL, priority 8, is waiting.
        Stack size: 256000. CPU time: 0 03:42:45.70
    Process 6, CON CHANNEL 1, priority 8, is waiting.
        Stack size: 256000. CPU time: 0 00:01:50.40
    Process 7, CON CHANNEL 2, priority 8, is waiting.
        Stack size: 256000.
        CPU time: 0 00:00:00.04
    Process 8, CON CHANNEL 3, priority 8, is waiting.
        Stack size: 256000. CPU time: 0 00:00:00.07
    Process 9, CON CHANNEL 4, priority 8, is waiting.
        Stack size: 256000. CPU time: 0 00:00:00.04
    Process 10, WATCH DBL BUF, priority 7, is waiting.
        Stack size: 256000. CPU time: 0 00:00:00.35
    Process 11, BEG/END_RUN, priority 7, is waiting.
        Stack size: 256000. CPU time: 0 00:02:43.36
    Accumulated CPU time for this job: 0 04:01:49.75
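
When recording symptoms, a captured Edebug listing can be compared with the
normal report to answer "are all the jobs there?" and "are all the processes
there?". The Python sketch below is one possible way to do this offline on
text copied from an Edebug session. It is not part of TRICS or Edebug: every
name in it (EXPECTED_JOBS, parse_jobs, parse_processes, missing_jobs,
coor_channel_ready) is hypothetical, and the line patterns are inferred only
from the listings in this file, so they may need adjusting for other output.

```python
import re

# Jobs that appear in the normal "Show System" report in these notes.
EXPECTED_JOBS = {
    "XQDRIVER", "CONSOLE", "EDEBUGREM", "DUDRIVER", "FALSERVER",
    "TRICS_V40", "MAIL_SERVER", "MPOOL_SERVER", "LOG_SERVER", "RTDRIVER",
}

# Patterns inferred from the listings in this file (an assumption, not a
# documented Edebug format). The process name is optional because a process
# may be listed without one while TRICS is still initializing.
JOB_RE = re.compile(r"Job (\d+), program (\w+), priority (\d+) is (\w+)\.")
PROC_RE = re.compile(
    r"Process (\d+), (?:([^,.]+), )?priority (\d+), is (\w+)\."
)


def parse_jobs(text):
    """Return {program name: state} for every job line found in text."""
    return {m.group(2): m.group(4) for m in JOB_RE.finditer(text)}


def parse_processes(text):
    """Return {process number: (name or None, state)} for each process line."""
    return {
        int(m.group(1)): (m.group(2), m.group(4))
        for m in PROC_RE.finditer(text)
    }


def missing_jobs(text):
    """Return the sorted names of expected jobs absent from a listing."""
    return sorted(EXPECTED_JOBS - set(parse_jobs(text)))


def coor_channel_ready(text):
    """True if the sub-process named CON CHANNEL 1 is in a job listing."""
    return any(
        name == "CON CHANNEL 1" for name, _ in parse_processes(text).values()
    )


# Job lines copied from the "sho system" listing of the Initialize Time phase.
EARLY_BOOT = """\
Job 2, program XQDRIVER, priority 1 is waiting.
Job 3, program CONSOLE, priority 2 is waiting.
Job 4, program EDEBUGREM, priority 3 is running.
Job 5, program DUDRIVER, priority 4 is waiting.
Job 6, program INITIALIZE_TIME, priority 16 is waiting.
"""

if __name__ == "__main__":
    print("Jobs not yet present:", ", ".join(missing_jobs(EARLY_BOOT)))
```

Run on the early-boot listing, the sketch reports the Level 1 jobs that have
not been created yet; run on the normal report, missing_jobs should return an
empty list. The result is only a convenience for the logbook and does not
replace looking at the TCC console.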