Startup configuration of the VC709 and FLX709 firmware to transmit
the LHC clock and L1A information to the Hub crates.

Rev: 22-Aug-2019

These instructions and the supporting screenshots can be found here:
https://web.pa.msu.edu/hep/atlas/l1calo/hub/reference/ttc_documents/hubttc_vc709_startup/

Section 0 is not strictly necessary. It is not needed for ssh and x2go access.
Sections 1 and 2 should happen automatically.
Skip directly to section 3 then 4 unless there is a suspicion of a firmware
or driver problem.

Rebooting should not affect the FLX709 firmware.
These instructions are needed for power up only.


0) Workaround to start a graphical console
-------------------------------------------

Left unattended, the booting sequence does not end with a graphical login screen.
The screen is left with a static "Scientific Linux 6" picture and no alternate console.

The workaround is to catch hubttc while it boots and while the screen displays
"Scientific Linux 6" with a spinning wheel below and type .
This switches to the ASCII version of the booting message stream.
Let it finish booting. The last message says: "Starting x2gocleansessions"

    start an alternate command line console
    login as hubuser
    type: startx


1) VC709 auto-configured with FLX709 firmware
----------------------------------------------

When the hubttc computer is powered up, the VC709 will also power up.
The "mode pins" on the VC709 tell the FPGA to automatically configure itself
with the FLX709 firmware from the flash memory.

If the FPGA was not configured, use instructions [A] below.


2) FLX kernel driver started automatically
-------------------------------------------

When the Linux OS boots, the kernel device drivers called "drivers_flx"
should start automatically and find the FLX709 firmware.
If you are watching the boot messages as described in Section 0 you should see
near the end of the boot sequence:

    Starting cmem driver
    major number for cmem_rcc is 245
    Starting io_rcc driver
    major number for io_rcc is 244
    1 flx cards found
    Starting flx driver
    major number for flx is 243

To check in more detail or diagnose problems see instructions [B] below.

The FELIX software should find the card and report:

    [root@hubttc ~]# flx-info
    General information
    There are 1 FLX cards installed in this computer
    -------------------
    Reg Map Version 4.3
    Card ID: FLX-709
    FPGA DNA: 22536180619915348
    FW version date: 24/07/18 13:23
    GIT tag : rm-4.3
    GIT commit number: 59
    GIT hash : 0x0000000064dec00a
    FIRMWARE MODE: GBT
    Output of lspci | grep Xil:
    06:00.0 Communication controller: Xilinx Corporation FPGA Card XC7VX690T

    Interrupts, descriptors & channels
    ----------------------------------
    Number of interrupts : 8
    Number of descriptors: 2
    Number of channels : 4

    Links and GBT settings
    ----------------------------------
    Number of channels : 4
    GBT Wrapper generated : YES
    Optical transceivers : 4

    Clock resources
    ----------------------------------
    MAIN clock source : LCLK fixed
    Internal PLL Lock : YES
    ADN2814 TTC Status : ON

Note: This is the software version we need to use for this version
of the flx driver and firmware.

    [root@hubttc ~]# flx-init -V
    This is version flxcard-03-02-06-83-g0f942bf of flx-init

If flx-info fails, as shown below, use instructions [B] below.

    [hubuser@hubttc ~]$ flx-info
    ERROR. Exception thrown: Failed to open /dev/flx.


3) Setup the VC709 to use LHC clock
-----------------------------------

Login to the hubuser account via the console or remotely via x2go or
'ssh -X hubuser@hubttc.pa.msu.edu'
(don't forget "-X" which is needed for elinkconfig)

Setup the SI5345 on the TTCfx-v3 mezzanine

    [hubuser@hubttc ~]$ flx-init -T 2
    Hard resetting the Si5345
    Beginning configuration process...
    finish initialization procedure
    configuration done ... enabling output ...
    LOS register = 0x00
    Sticky LOS register = 0x00
    LOL register = 0x02
    LOL register = 0x00
    Found lock in 2 seconds
    Sticky LOL register = 0x04
    TTC done. Not touching IDT and GBT.....

Switch from using the internal clock to using the external LHC clock.
We do not know of a command-line way to tell the FLX709 to select and follow
the LHC clock for this version of the software, so we use elinkconfig instead.

    [hubuser@hubttc ~]$ elinkconfig &

A window should pop up, cf screenshot elinkconfig_0_start.png

Click the button on the top row "Clock..."
select "TTC" and click "Ok"
cf. elinkconfig_3_clock_step1_select and elinkconfig_3_clock_step2_select_ttc.png

Note: keep elinkconfig running as it will be used in section 4.

The Hub(s) receiving the optical link should now be able to synchronize
their clocks to the LHC clock.

Re-initialize the FELIX firmware, for good measure.

    [hubuser@hubttc ~]$ flx-init
    flx-init: warning: Not all channels align!
    flx-init: warning: 4 channels not aligned

Go to section 4 or view the FLX709 status.
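
The clock-source check can also be scripted. The sketch below is only an
illustration, not part of the FELIX software: it assumes that flx-info prints
a line beginning with "MAIN clock source" (reading "LCLK fixed" before the
switch and "TTC fixed" after it, as recorded in these notes), and the function
names main_clock_source and uses_lhc_clock are made up for this example.

```shell
#!/bin/sh
# Sketch only: the function names are hypothetical helpers, and the field
# label "MAIN clock source" is taken from the flx-info output in these notes.

# Print the value of the "MAIN clock source" field from flx-info text on stdin.
# Accepts both "MAIN clock source : X" and "MAIN clock source: X".
main_clock_source() {
    awk -F: '
        !found && /^[ \t]*MAIN clock source[ \t]*:/ {
            value = $2
            gsub(/^[ \t]+|[ \t]+$/, "", value)   # trim surrounding blanks
            print value
            found = 1                            # report the first match only
        }'
}

# Succeed (exit status 0) only when the reported source is "TTC fixed".
uses_lhc_clock() {
    [ "$(main_clock_source)" = "TTC fixed" ]
}

# Possible use on hubttc, assuming flx-info is on the PATH:
#   if flx-info | uses_lhc_clock; then
#       echo "LHC clock selected"
#   else
#       echo "still on the internal clock"
#   fi
```

Both helpers read text from standard input, so they can be tried on a saved
copy of the flx-info output without touching the card.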
'flx-info' should now show that the LHC clock is used ("TTC fixed")

    [root@hubttc ~]# flx-info
    General information
    There are 1 FLX cards installed in this computer
    -------------------
    Reg Map Version 4.3
    Card ID: FLX-709
    FPGA DNA: 22536180619915348
    FW version date: 24/07/18 13:23
    GIT tag : rm-4.3
    GIT commit number: 59
    GIT hash : 0x0000000064dec00a
    FIRMWARE MODE: GBT
    Output of lspci | grep Xil:
    06:00.0 Communication controller: Xilinx Corporation FPGA Card XC7VX690T

    Interrupts, descriptors & channels
    ----------------------------------
    Number of interrupts : 8
    Number of descriptors: 2
    Number of channels : 4

    Links and GBT settings
    ----------------------------------
    Number of channels : 4
    GBT Wrapper generated : YES
    Optical transceivers : 4

    Clock resources
    ----------------------------------
    MAIN clock source: TTC fixed
    Internal PLL Lock : YES
    ADN2814 TTC Status: ON

You can also check the clock recovery status:

    [hubuser@hubttc ~]$ flx-info ADN2814
    TTC Status: ON
    Loss of Signal Status: 0
    Static Loss of Lock: 0
    Loss of Lock Status: 0

or, it may report, depending on previous history:

    [hubuser@hubttc ~]$ flx-info ADN2814
    TTC Status: ON
    Loss of Signal Status: 0
    Static Loss of Lock: 1
    Loss of Lock Status: 0


4) Configure the GBT payload on the optical links
-------------------------------------------------

The elinks and egroups and physical links need to be configured
to start transmitting L1A on all 4 SFP ports.

If not already done from section 3, login to the hubuser account via the
console or remotely via x2go or 'ssh -X hubuser@hubttc.pa.msu.edu'
(don't forget "-X" which is needed for elinkconfig) and start elinkconfig

    [hubuser@hubttc ~]$ elinkconfig &

A window should pop up, cf screenshot elinkconfig_0_start.png

- on the right half labelled "From-Host" select check-box for EPATH:1
  (bottom left of right half) *AND* select the drop-down field to change
  from "8b10b" to "TTC-4" cf.
      elinkconfig_4_L1A_step1_select_EPATH1.png

- deselect checkbox for EPATH 4,5,6,7 (top-right) cf.
      elinkconfig_4_L1A_step2_deselect_EPATH4567.png

- deselect EC (right side) cf.
      elinkconfig_4_L1A_step3_deselect_EC.png

- click Egroup "Repl 2 All" (lower-right) then "OK" cf.
      elinkconfig_4_L1A_step4_egroup_repl2all.png and
      elinkconfig_4_L1A_step5_egroup_repl2all_ok.png

- click Link "Repl 2 All" (near top) then "OK" cf.
      elinkconfig_4_L1A_step6_link_repl2all.png and
      elinkconfig_4_L1A_step7_link_repl2all_ok.png

- click "Generate/Upload" and in pop-up window click "Upload" then "Close" cf.
      elinkconfig_4_L1A_step8_link_upload.png and
      elinkconfig_4_L1A_step9_link_upload_upload.png and
      elinkconfig_4_L1A_step10_link_upload_done.png

- click "quit" to exit elinkconfig cf.
      elinkconfig_4_L1A_step11_quit.png

It is not obvious why the following last step is needed (but it is needed),
as 'fdaq' is a command for receiving data from the FLX709. Presumably it is
required to start sending the configured payload and wait for a response
from systems sending data back.

    [hubuser@hubttc ~]$ fdaq -E -t 1
    Consume FLX-card data while checking the data (blockheader and trailers); stops when an error is encountered
    Opened FLX-card 0, firmw GBT-4ch-709-1807241515-GIT:rm-4.3/59 channels=4 (cmem buffersize=1073741824)
    **START(emulator-ext)** using DMA #0 polling
    -> 1 sec, Rates: recv 0.0 MB/s, file 0.0 MB/s; Total: recvd 0 B, file 0 B; Buffer: 0%, wraps 0
    **STOP**
    -> Data checked: Blocks 0, Errors: header=0 trailer=0
    Exiting..

Now the L1A requests coming from the TTC crate should be forwarded to the Hub(s)
at the location expected by the Hub FW within the GBT payload.


==============================================================================================================

A) diagnostics and recovery instructions to configure FLX firmware
------------------------------------------------------------------

The FLX709 firmware should be automatically configured at power up.

If there is suspicion this did not happen, login as hubuser via the local
console or x2go and start vivado.
Check that the state of the FPGA is "Programmed".

Note: A USB cable is left permanently connected to the JTAG interface of the VC709.

The firmware files we use are located here

    [root@hubttc ~]# ls -l /home/hubuser/Xilinx/FLX709/000_KnownGood/
    total 71312
    drwxrwxr-x. 3 hubuser hubuser 4096 May 13 12:28 BuildFlashFile
    -rw-rw-r--. 1 hubuser hubuser 19468734 Jul 30 2018 FLX709_GBT_RM0403_4CH_CLKSELECT_GIT_RevertFromHost3_rm-4.3_59_180724_15_15.bit
    -rw-r--r--. 1 root root 53543472 May 29 16:01 FLX709_GBT_RM0403_4CH_CLKSELECT_GIT_RevertFromHost3_rm-4.3_59_180724_15_15.mcs

To re-program the VC709 flash:

    right-click xc7vx690t_0 | add configuration memory device | filter Micron+1024+bpi+x16
    select mt28gu01gaax1e-bpi-x16 | ok
    Do you want to program the configuration memory device now? | OK
    configuration file: navigate to
      /home/hubuser/Xilinx/FLX709/000_KnownGood/FLX709_GBT_RM0403_4CH_CLKSELECT_GIT_RevertFromHost3_rm-4.3_59_180724_15_15.mcs
    default fine for rest | OK
    (this takes a while... then "success" message)

Or to directly configure the FPGA use:

    /home/hubuser/Xilinx/FLX709/000_KnownGood/FLX709_GBT_RM0403_4CH_CLKSELECT_GIT_RevertFromHost3_rm-4.3_59_180724_15_15.bit

Note: From past experience, if the VC709 FPGA is reconfigured, the driver does
not seem to be able to reconnect to the FLX709 firmware.
A reboot is the only known solution.


B) diagnostics and recovery instructions to start flx driver
------------------------------------------------------------

To check, login to the hubuser account via the console or remotely via x2go or
'ssh -X hubuser@hubttc.pa.msu.edu'

The driver should have found the card and can report information.
This must show "tdaq710_for_felix_4.0.2" for this FLX709 firmware version.
Note that this status also shows the version of the FLX firmware "rm-4.3"

    [hubuser@hubttc ~]$ service drivers_flx status
    cmem_rcc 1085080 0

    >>>>>> Status of the cmem_rcc driver

    CMEM RCC driver for release tdaq710_for_felix_4.0.2 (based on tag ROSRCDdrivers-00-01-00)

    The driver was loaded with these parameters:
    gfpbpa_size = 4096
    gfpbpa_quantum = 4
    gfpbpa_zone = 0

    __get_free_pages
    PID | Handle | Phys. address | Size | Locked | Order| Name

    GFPBPA (size = 4096 MB, base = 0x00000004f9400000)
    PID | Handle | Phys. address | Size | Locked | Name

    The command 'echo > /proc/cmem_rcc', executed as root, allows you to interact with the driver.
    Possible actions are:
    debug -> enable debugging
    nodebug -> disable debugging
    elog -> Log errors to /var/log/messages
    noelog -> Do not log errors to /var/log/messages
    freelock -> release all locked segments

    io_rcc 16880 0

    >>>>>> Status of the io_rcc driver

    IO RCC driver for release tdaq710_for_felix_4.0.2 (based on tag ROSRCDdrivers-00-01-00)
    Dumping table of linked devices
    Handle | Vendor ID | Device ID | Occurrence | Process ID

    The command 'echo > /proc/io_rcc', executed as root, allows you to interact with the driver.
    Possible actions are:
    debug -> enable debugging
    nodebug -> disable debugging
    elog -> Log errors to /var/log/messages
    noelog -> Do not log errors to /var/log/messages
    Current values of the parameter(s)
    debug = 0
    errorlog = 1

    flx 35135 0

    >>>>>> Status of the flx driver

    FLX driver for RM4.0 F/W and TDAQ for release tdaq710_for_felix_4.0.2.
    Distributed with driver RPM 4.0.2

    Debug = 0
    Number of cards detected = 1

    Locked resources
            card | global_locks
    =============|=============
               0 | 0x00000000

    Locked resources
    card | resource bit |     PID | tag
    =====|==============|=========|=====

    Card 0:
    Card type : 709
    Device type : 0x7038
    FPGA_DNA : 0x7071b054
    Reg Map Version : 4.3
    Build revision (GIT version): rm-4.3
    Date and time : 24-7-2018 at 13h23
    GIT commit number : 59
    GIT hash : 0x64dec00a
    Firmware mode : GBT
    Number of descriptors : 2, Number of interrupts : 8
    Interrupt count | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
    Interrupt flag  | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
    Interrupt mask  | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
    MSIX PBA 00000000

    The command 'echo > /proc/flx', executed as root, allows you to interact with the driver.
    Possible actions are:
    debug -> Enable debugging
    nodebug -> Disable debugging
    elog -> Log errors to /var/log/message
    noelog -> Do not log errors to /var/log/message
    rm3 -> Enable compatibility with RM3 F/W
    rm4 -> Disable compatibility with RM3 F/W
    clearlock -> Clear all lock bits (Attention: Close processes that hole lock bits before you do this)

If the driver is started but cannot find the card the report will look like this

    [root@hubttc ~]# service drivers_flx status
    cmem_rcc 1085080 0

    >>>>>> Status of the cmem_rcc driver

    CMEM RCC driver for release tdaq710_for_felix_2.0.2 (based on tag undefined)

    The driver was loaded with these parameters:
    gfpbpa_size = 4096
    gfpbpa_quantum = 4
    gfpbpa_zone = 0

    __get_free_pages
    PID | Handle | Phys. address | Size | Locked | Order| Name

    GFPBPA (size = 4096 MB, base = 0x00000001fe000000)
    PID | Handle | Phys. address | Size | Locked | Name

    The command 'echo > /proc/cmem_rcc', executed as root, allows you to interact with the driver.
    Possible actions are:
    debug -> enable debugging
    nodebug -> disable debugging
    elog -> Log errors to /var/log/messages
    noelog -> Do not log errors to /var/log/messages
    freelock -> release all locked segments

    io_rcc 16880 0

    >>>>>> Status of the io_rcc driver

    IO RCC driver for release tdaq710_for_felix_2.0.2 (based on tag undefined)
    Dumping table of linked devices
    Handle | Vendor ID | Device ID | Occurrence | Process ID

    The command 'echo > /proc/io_rcc', executed as root, allows you to interact with the driver.
    Possible actions are:
    debug -> enable debugging
    nodebug -> disable debugging
    elog -> Log errors to /var/log/messages
    noelog -> Do not log errors to /var/log/messages
    Current values of the parameter(s)
    debug = 0
    errorlog = 1

If the driver was not started, check that it was set to start automatically
(this should say "5:on")

    [root@hubttc FLX709]# chkconfig --list drivers_flx
    drivers_flx 0:off 1:off 2:on 3:on 4:on 5:on 6:off

The driver can be started and stopped manually.
The screen capture below shows a case where the driver cannot connect to the card.

    [root@hubttc ~]# service drivers_flx stop
    Shutting down cmem_rcc driver
    Shutting down io_rcc driver

    [root@hubttc ~]# service drivers_flx start
    Starting cmem driver
    major number for cmem_rcc is 245
    Starting io_rcc driver
    major number for io_rcc is 244
    0 flx cards found

Note: In the early days we did not have enough memory (no longer a problem
with 12G), and the cmem_rcc driver, which tries to grab contiguous memory,
would either always fail or succeed on first boot but not on restart.
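
The two driver checks in section B (number of cards found, and the driver
release string) can be scripted along the following lines. This is a sketch
under stated assumptions, not part of the FELIX software: it assumes the
"N flx cards found" message sits on its own line with the count first, as in
the transcripts above, and the function names flx_cards_found and
has_expected_release are made up for this example.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical helpers. The message formats
# are copied from the transcripts in these notes and may differ for other
# driver versions.

# Print N from the first "N flx cards found" line in the text on stdin.
# Prints nothing if no such line is present.
flx_cards_found() {
    awk '!seen && /flx cards found/ { print $1; seen = 1 }'
}

# Succeed (exit status 0) only if the driver status text on stdin mentions
# the release expected for this FLX709 firmware version.
has_expected_release() {
    grep 'tdaq710_for_felix_4\.0\.2' > /dev/null
}

# Possible use on hubttc, as root:
#   service drivers_flx start  | flx_cards_found
#   service drivers_flx status | has_expected_release || echo "unexpected driver release"
```

A count of 0 corresponds to the failure case captured above, where the driver
starts but cannot connect to the card.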