COOR-TCC Protocol for Run II
============================
Guidelines
==========
Updated: 7/14/97

Short history of COOR-TCC protocol
----------------------------------
Prototype version of COOR-TCC protocol created for 1987 test beam
    Syntax often awkward and inconsistent
    ASCII, but often un-parsable by a human
    No consistency of format (mixed binary, decimal, hex)
COOR-TCC protocol reworked before commissioning at D0 Hall
    Keep some features
        ELNCON, Message ID field, fixed keyword command field
    Change some features
        Add more messages
        More intuitive
        Unify syntax
    TCC was a uVAX II
        Syntax was kept simple to match its limited processing power
    Produce written specification/documentation
Several upgrades during Run I to match hardware upgrades
    Add L1 Cal. Trig.
    Add Begin/End Run scaler snapshots
    Add L1.5 Framework
    Add L1.5 Cal. Trig.

Perceived Advantages of Run I Protocol (things to preserve)
-----------------------------------------------------------
TCC sends one acknowledgement message per COOR command message
    TCC acknowledgement always instantaneous
        Execution takes much less time than transfer time (0.1 s)
        Except for Begin/End Run File (from 2 s to 2 min depending on
        the host file server)
            Solution: TCC launches a task and COOR synchronizes later
        Initialization: simply takes long (1 min in Run I with no errors)
        Very few L1CT messages took about 1 second (e.g. Missing Pt
        threshold)
All commands are human-readable ASCII
    Keywords for message type
    Some other keyword fields and values (in L1CT)
    With a common/uniform format
    Few simple syntax rules
    All numbers are decimal strings (with some keywords like POS_ETA)
Same command interface used on the L1 Simulator
Clean interface between COOR and trigger hardware
    Easy to compose/send test messages to TCC online or to the simulator
    Easy to check during commissioning
    Easy to check current/archived messages and understand the
    configuration

Perceived Disadvantages of Run I Protocol (things to improve)
-------------------------------------------------------------
Lots of individual messages
    Would scale up to 1000 messages in Run II for 128 SpTrg to download.
    Run I syntax would allow programming several SpTrg in one message,
    but only for one type of resource (e.g. program several prescale
    ratios).  This wasn't used except for SpTrg enable/disable messages,
    probably because it was orthogonal to the way COOR organized its
    actions (per SpTrg as opposed to per property).
Message roundtrip latency too slow to support lots of individual messages
    0.1-0.2 s per message + acknowledge
    Latency is in the network layers and serialization, NOT in processing
Some messages were somewhat useless or redundant
    Some functionality was never exercised (e.g. independently
    specifying the list of Geographic Sections to digitize and the list
    of Front-End Busys to listen to).
    Some programming never departed from the standard default
    configuration (e.g. SpTrg always told to obey L2 disable).
Action verbs limited to fixed location and fixed length
    This might have been appropriate for the limited power of the
    uVAX II, but is not necessary today
    Acknowledge messages were also fixed-field/fixed-length,
    and were content-poor in terms of diagnostic/debug information
Poor feedback from COOR back to the user or to DAQEXP
    Error status not displayed and/or not associated with a particular
    message (e.g. show the difference between BAD PARAM and BAD ERROR)
    Supplemental error information lost (e.g. Begin/End Run file errors)
    Some diagnostic information sent to COOR's log files, but not all
Sub-system separation not very apparent in messages
    L1 FW / L1.5 FW / L1 Cal / L1.5 Cal
Initialize message took long and COOR would not wait
    Especially because (I think) the TRIGGER INIT used a side door to
    COOR and didn't seem (I think) well synchronized with TAKER requests.
    We had users and novice DAQEXPs confused several times because of
    that.  This also occasionally caused a loss of synchronization where
    COOR would parse the acknowledgement for message n-1 instead of n.
    The point is that COOR must wait for completion of initialization.
    There is no way around this, and no point in COOR getting any
    further if TCC isn't done initializing.
COOR would always lose the first message after losing the connection to
TCC (e.g. after a TCC reboot)
    COOR needs to resend messages after re-connecting.
    This was addressed but never made to actually work.
Worth mentioning, but not a real problem, just an overall choice:
    Message information did not include
        specific trigger logical names
        local/global run numbers for SpTrg
        And-Or Term logical names
    This knowledge transfer was not needed for triggering or programming.
    Some people complained that TRGMON did not report that kind of info.
    Note that this information could have been obtained by TRGMON from
    COOR and/or COOR input/output files.  TRGMON was/is low-level
    hardware monitoring... but is also used by shifters and detector
    users.

Guidelines for Run II
---------------------
Keep simple ASCII format, improve keywords and formatting
Keep one acknowledge per message, or per message group
Improve error information in the acknowledgement
    And improve the path back to TAKER and/or DAQEXP, with a view of the
    offending message and its acknowledge status
Condense the number of messages
    Group the messages: e.g. one message per SpTrg
        Allow setting several/all properties of a SpTrg in one
        transaction
    Still need a separate message to enable/disable one or a set of SpTrg
    Still need a separate message to change prescale ratios between runs
Condense the length of messages by using ranges in the syntax
(e.g. a range of geographic sections 0:127)
    Carries the same information as an exhaustive list
    Actually emphasizes contiguous groups and highlights the holes
    More compact
    This was used with great success in Run I L1 CT Reference Sets.
Minimize the number of SpTrg features to set up, but without losing
functionality
    Still keep the widest functionality/flexibility accessible at COOR
    level
    Implement all message types necessary to access all bells and
    whistles, but suppress any messages that match standard/normal
    default settings.
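The range syntax discussed above can be illustrated with a small
sketch.  This is not the actual COOR-TCC grammar; the comma-separated
`lo:hi` form and the function names below are assumptions chosen only
to show how a range list carries the same information as an exhaustive
list while making contiguous groups and holes stand out.

```python
def expand_ranges(spec):
    """Expand a hypothetical range list like '0:3,9,12:14' into the
    explicit list of integers it denotes (ranges are inclusive)."""
    out = []
    for part in spec.split(","):
        if ":" in part:
            lo, hi = part.split(":")
            out.extend(range(int(lo), int(hi) + 1))
        else:
            out.append(int(part))
    return out

def compress_ranges(values):
    """Inverse operation: emit contiguous runs as 'lo:hi' so that holes
    in a sorted list stand out (e.g. [0,1,2,3,9] -> '0:3,9')."""
    runs = []
    lo = hi = values[0]
    for v in values[1:]:
        if v == hi + 1:
            hi = v                    # extend the current contiguous run
        else:
            runs.append((lo, hi))     # close the run; a "hole" follows
            lo = hi = v
    runs.append((lo, hi))
    return ",".join(f"{a}" if a == b else f"{a}:{b}" for a, b in runs)
```

For example, `expand_ranges("0:127")` yields all 128 geographic section
numbers, while `compress_ranges` applied to a list with one missing
section immediately exposes the hole.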
Split "INIT" in 2 pieces (because of the Run II use of FPGA technology)
    Download all FPGA configurations (only after power-up, slow = minutes)
    Reset to default programming (clean restart, faster = seconds)
Trigger Exposure Groups
    For luminosity accounting (avoiding scalers for 128 SpTrg * 160
    bunches)
    There will be up to 8 Trigger Exposure Groups (Run I had "sort of"
    1-3)
    Split off some And-Or Terms used as Beam Quality (e.g. SCINT_VETO)
    Limit the number of distinct Beam Quality And-Or Term groupings
    Limit the number of distinct subsets of geographic sections for
    readout
How do we handle the "heartbeat trigger"?
    The idea is to keep the whole D0DAQ cycling at ~0.2 Hz between runs.
    TCC needs a SpTrg reserved and programmed with the auto-disable
    feature.
    TCC watches for a 5 s timeout on normal event flow and forces an
    event.
    TCC must know which SpTrg it is so that it can hit the right one.
    Should it be 100% programmed by COOR or part of TCC initialization?
    Other parts of the DAQ system must also either be programmed to
    answer this heartbeat, or know what to do by default.
    How much enable/disable control is left to COOR and/or Level 3?

Concepts NEW for Run II
-----------------------
1) DZero-wide Heartbeat Trigger
    In Run I we had an internal heartbeat trigger, without stimulus to
    geographic sections and without readout to L3.
2) Trigger Exposure Groups
    In Run I we had, in essence, all Specific Triggers in one group
    (all SpTrg read out the same Geographic Sections).  There were some
    variants (e.g. the L0 single interaction flag) that were also
    monitored and could be counted as equivalent to a total of 1-3
    groups.
3) Level 2 Trigger Framework
    In Run I we had the L1.5 Trigger Framework, with the important
    distinction that only a subset of the events were sent to the L1.5
    Trigger, and only a subset of the Level 1 SpTrg could be enabled to
    send their events to the L1.5 Trigger.  In Run II *all* events will
    always go through the Level 2 System.
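The heartbeat logic described above (TCC watching for a 5 s gap in
normal event flow and forcing an event on the reserved SpTrg) can be
sketched as follows.  This is only an illustration of the timeout
behavior: the class name, the `fire` callback, and the use of a
pollable watchdog are assumptions, not the actual TCC design.

```python
import time

HEARTBEAT_TIMEOUT = 5.0   # seconds without normal event flow (from the text)

class HeartbeatWatchdog:
    """Illustrative sketch: if no normal event has flowed for 5 s,
    force one event on the reserved heartbeat SpTrg."""

    def __init__(self, reserved_sptrg, fire, clock=time.monotonic):
        self.reserved_sptrg = reserved_sptrg  # SpTrg TCC must "hit" (hypothetical)
        self.fire = fire                      # callback that forces one event
        self.clock = clock                    # injectable clock, for testing
        self.last_event = clock()

    def on_normal_event(self):
        # Any normal event resets the timeout window.
        self.last_event = self.clock()

    def poll(self):
        # Called periodically by TCC's main loop; forces a heartbeat
        # event when the timeout expires, then rearms the window.
        now = self.clock()
        if now - self.last_event >= HEARTBEAT_TIMEOUT:
            self.fire(self.reserved_sptrg)
            self.last_event = now
```

With nothing else running this forces roughly one event every 5 s,
i.e. the ~0.2 Hz cycling rate mentioned above.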
4) Level 2 Trigger PreProcessor (L2CTPP)
    In Run I we had the L1.5 CT with similar functionality.  All
    messages were actually parsed and processed by TCC.  A memory block
    with a binary data structure was prepared and shipped to the L1.5
    hardware.
5) Level 2 Trigger Global Processor
    In Run I the L1.5 CT partially filled this functionality.
The Run II method of programming the L2 CTPP and the L2 Global is still
being worked on, but it seems there will still be ASCII messages and a
third-party translator program (on the host or TCC) to fill a binary
data structure that can be read by the L2 CTPP or L2 Global processors.
These two systems (L2 CTPP and L2 Global) are not *directly* discussed
here, but we should at least *try* to make the L1, L2 and L3
programming interfaces *similar*, or compatible, if not truly
identical.
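The translator idea above (ASCII messages in, binary data structure
out) can be sketched minimally.  The field names, the "KEY value" line
format, and the packed record layout below are all invented for
illustration; the real L2 CTPP / L2 Global structures were still being
designed when this was written.

```python
import struct

# Hypothetical packed record: little-endian SpTrg id (u16),
# threshold (u16), prescale (u32).  Layout is illustrative only.
RECORD_FORMAT = "<HHI"

def translate(ascii_msg):
    """Turn a flat 'KEY value' ASCII message into one packed binary
    record, in the spirit of the translator program described above.
    Numbers are decimal strings, as in the Run I protocol."""
    fields = {}
    for line in ascii_msg.strip().splitlines():
        key, value = line.split()
        fields[key.upper()] = int(value)
    return struct.pack(RECORD_FORMAT,
                       fields["SPTRG"],
                       fields["THRESHOLD"],
                       fields["PRESCALE"])
```

The point of the sketch is the separation of concerns: the ASCII side
stays human-readable and checkable, while only the translator knows
the binary layout the processor hardware consumes.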