From:   JIML::LINNEMANN "Jim Linnemann at MSU (517)355-3328"  8-AUG-1997 17:40:10.74
To:     EDMUNDS
CC:
Subj:   revised version of requirements

Here are the pieces of functionality we would like from Magic Bus and/or Fred

A. Event Input data:
     flows at 10 kHz
     can be substantial volume: 1-20 KB/event, giving 200 MB/s worst case
     buffering for 16 events on input
     up to 12-14 sources of event-type data

B. Output Event data: [only for calorimeter crate]
     flows at 10 kHz on same MBus heavily loaded for input
     data volume smaller: less than 1 KB/event typical, maybe X 5 less
     Comes split among 2-3 sources, each a separate alpha card
     [target is in another crate, so each source need not be
      event-synchronous with other source, nor strictly interleaved with
      reading of event, ie sequence can be IOIIO, not restricted to IOIOIO]
     [planning to write output data L3 for Global and Cal preprocessor
      on VME]

C. Communication data:
     The alpha processors need to communicate [across MBus] several times
     per event. Because of long VME latency with the D0 VBD readout
     controller, this looks impractical to do in VME, at least in the
     Global crate. The messages are 5-20 B long, and need latencies of
     order 1 usec.

D. Fast Monitoring
     The alpha cards would like to be able to set, and have respond in not
     too many instruction cycles, some lines which represent the state code
     of the internal state of the node, and the number of buffers currently
     occupied on the card. This is a minimum of 8 bits (4+4), which could
     be received, decoded, and scaled (at 132 ns) in the trigger framework.
     For flexibility we suggest the ISA (Fred) port have 32 bits, with at
     least of these bits bidirectional. These same lines would be very
     handy for Logic Analyzer debugging.

How we might implement these on (our understanding of) MBUS/Fred.

It would have been nice if
   - MB Sources provided 24b of address and PCI/MB Bridge (Bridge) passed
     along lower 16 bits
   - Alpha cards could source and sink with 24b of address at no worse than
     half of full MB speed
   - provided alpha cards can write to each other and other MB devices at
     reasonable speed, having 12 b of address space as broadcast can
     probably substitute adequately for the full flexibility

Below are some more detailed comments on what we would like MBus to do,
under the assumption that the Bridge only sees 8 (or better 12) bits of
address reading, but can source with a larger address range.

1. We have 16 sources and 16 addresses associated with these sources.
   This can be encoded as 1 Byte of address interpreted as Higher and Lower
   half-bytes. The source is responsible for providing to MBus its
   [full 16-bit] address information, which is used in the DMA engine to
   point data towards particular memory blocks. The location within these
   blocks is determined strictly by sequential address of data within
   transfer, assigned by the DMA controller counter. The DMA controller
   switches blocks when it notes a change in 16-bit address.
   * [When it switches, does the location count revert to 0, or is this
     the job of the CPU when resetting for "next event"? Are there
     separate counters for each of the 256 addresses?]

2. Event Input data:
     flows at 10 kHz
     can be substantial volume: 1-20 KB/event, giving 200 MB/s worst case
     buffering for 16 events on input
     up to 12-14 sources of event-type data

   All event-type data is seen by MBUS as write cycles into Alpha boards
   from MBus Transceiver (MBT) cards. We will probably have 2 MBT cards of
   8 sources each in the crate.
   [how do we daisy-chain these to produce "end of event"?]
   [whose responsibility is it to produce "end of event"? MBus
    termination? MBT card (daisychained across front panels? on MB
    lines?)]
   * Assume that any MB source slot can assert any address, and that
     physical slots, not addresses, are used for arbitration
   * If so, does one MBT card have to have all its sources finished before
     relinquishing bus mastership?

3. Output Event data: [only for calorimeter crate]
     flows at 10 kHz on same MBus heavily loaded for input
     data volume smaller: less than 1 KB/event typical, maybe X 5 less
     Comes split among 2-3 sources, each a separate alpha card
     [target is in another crate, so each source need not be
      event-synchronous with other source, nor strictly interleaved with
      reading of event, ie sequence can be IOIIO, not restricted to IOIOIO]

   Send out on MBus to particular target addresses on MBT cards
   * assumes Alpha card bridge can WRITE to MBUS, not just read from it
   * speed need not be as fast as input, but not more than 10X slower?
   * need standard MBus arbitration logic on Alpha card
   * assume that this is a message to a particular location on MBus, so
     that MBT card can be set up to be the only listener interested in
     messages to these addresses. Assume that the listener on the MBT card
     can be set up as either a FIFO (constant address) or a memory (Alpha
     source changes MB address). Which, if either, of these can the Alpha
     card support?

4. Communication data:
     The alpha processors need to communicate [across MBus] several times
     per event. Because of long VME latency with the D0 VBD readout
     controller, this looks impractical to do in VME, at least in the
     Global crate. The messages are 5-20 B long, and need latencies of
     order 1 usec.

   Sources 14-15 are alpha-alpha communication. Any of the CPUs can
   describe itself as either source. There are 16 messages supported on
   this channel; a message is defined by which CPUs [if any] have their DMA
   engines set up to transfer that source to a memory buffer; if not, it
   is ignored.
   If we do need one address [here effectively 6-bit: 14/15 + 4b for
   "event"] per message, and must keep them distinct, that might force us
   to want 12b addresses instead of 8b.

   Source 15 messages should be able to get bus mastership between sources
   of an event, as the communication is not synched to an input event, and
   we do not wish to accept the latency of waiting for the end of an input
   event.
   * Assumes that MBT cards allow arbitration between their sources, not
     the full half-event. But if so, what is to prevent downstream card
     from seizing mastership if an Alpha card is not ready to do so?

5. Here are the presently-known messages which pass in a simple
   worker-administrator crate for a global processor:

   a. Worker sends to Admin the buffer number of an event, the 128 bit
      answer for a 16-bit event number, and if it passed, a list of start
      addresses and wordcounts for data to be read out.
      It then stalls until the reply message b:

   b. Admin replies by giving (perhaps with event number and buffer
      number) the buffer number to use when an event comes into the mapper
      slot occupied by the present buffer number.

   c. Admin sends the 128 bit answer to HW Framework by sending on MBus;
      this message goes to an output module on a MagicBusTransceiver (MBT)
      card.
      [is an acknowledge message required?]
      [switch select on which card this is active]

   d. [Via MBUS???] an interrupt must be generated for EACH alpha card in
      the crate when
        a) Event Finished is raised by the [last] MBT card
        b) the DMA engine has finished transfer of this event into Main
           Memory
      [is an acknowledge from Workers to Admin required? If so, one per
       worker?]
      [if multiple workers:
        - there seems no mechanism to send to a DIFFERENT address inside
          the predefined block of memory, so again may need to be distinct
          messages
        - if need ack's from two workers, would they be separated in time
          until Admin has time to digest the message?
        - instead, need to go to different memory blocks!!?]

   e. [Via MBUS??? Fred??] send GO to MBT card[s] for next input event

   f. [Cal Preprocessor Crate] Jet worker writes Jets to L2Global by
      sending to MBT [switch-enabled]. [ack required?]

   g. [Cal Preprocessor Crate] Em Worker writes EM to global.

   h. [Cal Preprocessor Crate] MET worker?
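
As an illustration of the 1-Byte address scheme in item 1 (two half-bytes, 16 x 16 = 256 distinct addresses) and of sources 14-15 being reserved for alpha-alpha communication in item 4, here is a minimal Python sketch. The function names are hypothetical, and putting the source number in the HIGHER half-byte is an assumption; the text above does not say which half-byte carries the source.

```python
# Sketch only: nibble order (source in the high half-byte) is an assumption.
COMM_SOURCES = (14, 15)  # alpha-alpha communication sources from item 4


def pack_address(source, sub):
    """Combine a 4-bit source number and a 4-bit sub-address into one byte."""
    if not (0 <= source <= 15 and 0 <= sub <= 15):
        raise ValueError("source and sub must each fit in 4 bits")
    return (source << 4) | sub


def unpack_address(addr):
    """Split one address byte back into (source, sub) half-bytes."""
    if not 0 <= addr <= 0xFF:
        raise ValueError("address must fit in 8 bits")
    return addr >> 4, addr & 0x0F


def is_communication(addr):
    """True if the address byte belongs to a communication source (14 or 15)."""
    source, _ = unpack_address(addr)
    return source in COMM_SOURCES
```

With this layout, pack_address(15, 3) gives 0xF3 and unpack_address(0xF3) gives back (15, 3). Going to 12b addresses, as discussed in item 4, would simply widen one of the two fields.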