           Reply to H-Clk Review Comments
          --------------------------------


                                            Rev. 24-Oct-2011

Thank you for studying the H-Clk review material and for the
review comments.  I will first answer what I believe to be
the 2 main points that I see in the comments.  I will then
try to answer all the detailed points that were brought up.


What I believe are the 2 main points:
-------------------------------------

> ... why do it in VME ...

   A year ago when this design started VME was the format
   that I understood people wanted the GPS timestamp and
   trigger control functions in.

   I thought that the control software folks wanted one path,
   i.e. VME,  to talk to all things:  the GPS timestamp system,
   the trigger control function, the setup of the TDC and
   scaler cards and then the event readout of these cards.

   At one time there was interest in directly reading out
   from the H-Clk card on a per event basis.  The cleanest
   way for folks to keep the H-Clk reads and the TDC reads
   aligned in events  was to make the H-Clk a VME card.

   I worry about making a system that needs to run for some
   years with high uptime out of lots of little demo cards all
   set out on a card table with lots of wires running around
   between them and all of this running from some wall-wart
   power supplies.    I, like everyone else, build temporary
   test systems that way  but not online production systems.

   In my experience supporting remote electronics systems
   there is a significant advantage in being able to replace
   things at the module level.  The folks who are on site
   often do not know the details of a given system.  It can
   be very convenient to ask someone to replace a VME module
   vs  fly down to repair a specialized setup of demo and
   prototype cards.     I have done both   :)

   My understanding is that folks may/will want to run
   the GPS timestamp and/or trigger control functions at
   a number of sites.  Thus making a real VME module  vs
   making a number of sets of demo cards    sounds
   like the right way to go.


> ... why a Virtex II FPGA ...

   Because I have a stash of them from a previous large project,
   i.e. the 10% spares that were purchased to support that
   project and never used.  They are free to the H-Clk project.

   They are more than large enough and fast enough to implement
   any rational version of the GPS timestamp and/or the trigger
   control functions.

   I'm not currently responsible for the GPS or trigger control
   firmware but if I get called on in an emergency to help with
   this then this family of Virtex is the part that I have the
   most existing firmware for that is similar to the functions
   that are needed for HAWC.

   In my experience I have found that whatever version of FPGA
   tools you start with,  you must preserve that software for
   the duration of the experiment anyway,  otherwise you can
   not even rebuild exactly the firmware that you are currently
   running, let alone make controlled changes to it.  Whether
   you start with the current cut from Xilinx or a couple of
   versions back is not the main point.  Keeping that software
   on a stable running computer for the life of the experiment
   is the main issue.  This is an especially important point
   with experiments that run for decades.  Doing the firmware
   development on a stable operating system is an important
   choice.  Building the firmware with scripts  (and not by
   clicking things in gui menus)  is another important point
   if "N" years from now you want to be able to re-make
   exactly the same part as you made this afternoon.

   If folks think that it is important to have a current part
   on the H-Clk card - I'm happy to do that.  I don't see it
   as a big advantage but I'm happy to do it.  Just let me know
   what part you want put on it.


Detailed Points:
----------------

> To me it seems that this is a conceptually simple system,
> 10MHz in -> 40MHz out, and make some logical decisions based
> on a few inputs.  Then output lots of lines in various formats.

   Yes, I too think that this is and should be a simple,
   straightforward system.  To me that's one of the advantages of
   splitting the GPS time stamp  and  trigger control/fanout
   functions onto two separate cards.  These are separate
   functions so make them separate in the implementation.
   That helps keep things simple and clear so that folks can
   understand how it works and maintain it as requirements
   change.


> It seems to me the easiest/cheapest implementation of this
> involves a commercial FPGA eval board, level shifters, and
> a clock generator.  This would all run on its own power supply
> and be interfaced via USB or Ethernet.

   Yes, the H-Clk card is basically just the "level shifters,
   and a clock generator"  part of what you describe.
   I thought that people wanted the VME module format and that
   it has the advantages that I described above.


> All of the functionality of the TDC is embedded in this system,
> is it really needed to be an external unit? Basically I'm saying
> if this is in VME, no TDC is needed, if a TDC is used, then this
> might as well move out of VME to save a crate.

   I'm sorry but I probably do not understand this question.

   Are you asking, "Why TDC readout of the timestamp ?"
   That is what was suggested/requested by the managers.
   Some of the assumed advantages were: assured event alignment
   with the PMT TDC data, assured ability to readout at the
   same rate as the CAEN TDCs and fit in with any special
   VME things that CAEN may be doing.  I believe that it was
   generally thought to be lower risk than separate direct VME
   readout of the timestamp.   We did try to sell the idea
   that H-Clk would readout the timestamp  directly over VME
   but there were no buyers.


> The presentation of the design materials looks quite unusual
> to me and it was difficult to separate log-like entries from
> design-entries.

   I thought that folks were basically familiar with this
   project so I did not take the time to make a cleaned up
   presentation for this review.  What I dumped on the web
   is basically  all  of my notes running back over the past
   year.

   During that time the requirements have changed a lot and
   thus the cards to implement the GPS timestamp and trigger
   control functions have changed.  Some of my notes contain
   a lot of history.

   In my review materials announcement note I tried to provide
   a reader's guide,  i.e.  "start with the file:
   h_clk_card_design_points.txt".

   This file describes the functions that are currently on
   the H-Clk card  and  has no old history in it.


> The mezzanine board needs justification.

   Lots of folks who design FPGA based systems for HEP find
   it useful and cost saving to put the FPGA on a mezzanine
   card.  This obviously is not the right choice if GHz serial
   signals are involved but for something like the H-Clk card
   functions there is no electrical problem in doing it.
   It isolates the BGA assembly and the only somewhat dense
   and tight design rule part of the overall project to
   the small mezzanine card.  The main 6U x 160mm VME size
   card is a very relaxed inexpensive 6 layer card.  By using
   the FPGA mezzanine I have a nice clean quiet layout of
   the main card.


> Why not use the FG456 prototype board instead?  There's a
> lot of work in there that might not be needed.

   On other projects I have used commercial FPGA prototype
   cards.  To get what I wanted for this design I rolled
   my own.


> Does this follow the recommended breakout routings?

   This is embarrassing but I really do sit up at night
   reading books on BGA breakout strategy.  This is not
   my first or anywhere near hardest BGA or FPGA design.
   I have a well read and annotated copy of the Xilinx
   breakout application notes.


> Some of the mezzanine routings look quite odd to me,
> especially near the connectors

   Please describe for me where you see any screwed up routing.
   If I have messed something up I need to fix it.

   Note that the 400 main connections to the Mezzanine are
   to pads on the bottom of the card.  That may be why the
   routing looks strange to you.  This is a zero insertion
   force connector setup that we have had no trouble with
   on other systems.

   All of the signal routes should look direct and basically
   radial out from the FPGA foot print to the connector pads on
   the bottom near the perimeter.   Global Clock Net pairs have
   been extra carefully routed and isolated and are the shortest.

   As I said, these are draft gerber files.  I have not run
   a final set of design rule checks and clean up passes.
   We typically tighten the design rules until something
   fails and then try to fix that and loop this until you
   hit the wall.  I.E. make it as easy to manufacture with
   100% yield as possible.   Nothing is tight unless it
   inherently needs to be tight.


> I didn't see a single, unified BoM or design rules document
> (former as spreadsheet)

   This design is not finished and thus the final Bill of
   Materials is not available.  I'm happy to make it available
   to you when it is ready.

   If you have a question about a specific part I'm happy
   to answer it.

   One reason that the BoM is not in final form is that I
   may make a number of substitutions before it goes out for
   build depending on what parts we have in stock.  We have
   lots of SMD part spools on hand.  For many things,  e.g.
   LED series resistor, CMOS pull-ups   a wide range of values
   would work perfectly well.


> Virtex-II choice rationale? EOL on these? Eval boards are
> gone for sure from OEM market.
> $221 from avnet, others are in the secondary market now

   Please see my comments on this above.  If you don't
   think that a Virtex II part can easily do these simple
   functions then I'm happy to change it.  No problem.


> If I interpret the documents correctly, each line of plain
> text needs to be correctly implemented in order to have a
> properly working board, is that right? Or is there a different
> cross-check?

   I'm not certain what you are asking about.  Is this
   a question about the net list for the H-Clk card ?


> The documentation on the Mez_456 seems to have new design
> rules as it goes along "0.20mm traces with 0.13mm clearance"
> etc. Is this from some source?  Documentation from the
> Virtex-II line?  Eagle or OrCad-derived rules? Rules of thumb?

   Yes.   The design rules change as you get out from
   under the BGA.  We do the extra work to widen traces and
   spacing  (i.e. enforce an easier to manufacture set of
   design rules)  once you are out from under the FPGA.

   The design rule for a given trace segment, via, pad,
   hole, fill, whatever  depends on what electrical set
   of nets it belongs to, where it is, what layer it is
   on, ...    We are not constrained to enforcing just
   one design rule,  e.g.  0.15mm traces and 0.15mm spaces
   or something like that.

   It makes the card harder to manufacture if you have
   tight tolerances everywhere instead of just where you
   actually need them.


> Okay, on the H-Clk general block diagram, not clear why
> VME is used at all, makes for expensive connectors, and VME
> interfacing which appears to all be in the FPGA

   Please see my initial general comments.
   The connectors to the VME backplane do not
   dominate the cost of this card  (about $20).


> VME IP exists and is in hand for the Virtex-II?
> Or built up by hand?

   We have previously built VME interface firmware in 3
   different families of Xilinx parts including Virtex.  

   As it stands this is simple - slave only  A24D16 only.
   You can do that in the morning before coffee.
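
   As a rough sketch of what "slave only  A24D16 only" amounts
   to, the card only has to recognize an A24 data address modifier
   and compare the upper address bits against its base address.
   The base address, window size, and function names below are
   made up for illustration, and the data strobe / LWORD* handling
   is left out.  The real decode would live in the FPGA firmware,
   not in Python.

```python
# Illustrative A24/D16 slave address decode (not the H-Clk firmware).
# BASE_ADDR and WINDOW_BITS are hypothetical values chosen for the example.

A24_DATA_AMS = {0x39, 0x3D}   # A24 non-privileged / supervisory data access
BASE_ADDR = 0x400000          # hypothetical card base address
WINDOW_BITS = 16              # hypothetical 64 KiB register window


def card_selected(address, am):
    """Return True if this bus cycle addresses our A24 window."""
    if am not in A24_DATA_AMS:
        return False                      # not an A24 data cycle, ignore it
    addr24 = address & 0xFFFFFF           # only 24 address bits are valid
    return (addr24 >> WINDOW_BITS) == (BASE_ADDR >> WINDOW_BITS)


def register_offset(address):
    """Offset of the addressed register inside the card's window."""
    return address & ((1 << WINDOW_BITS) - 1)
```

   A slave-only card never has to arbitrate for the bus, which
   is a large part of why the interface stays this small.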

   I'm not doing the firmware for this project but if
   I need to help then the VME part will not be a problem.


> Why not a standard VME interface chip?

   I'm not certain what specifically you had in mind,
   e.g.  a full Tundra VME interface chip,  something
   like the old Cypress VME interface stuff,  something
   like the Texas VMEH22501 chips  ?

   Basic answer - none of these specialized parts are needed,
   you gain nothing from using them in a simple slave only A24D16
   only setup,  it's simpler without them,  even with them you
   still need to do parts of VME interface yourself anyway.


> Why your own PLL? Why not some Silicon Labs or IDT chip that
> does this? That would also be able to do the slower clocks for
> a triggered "all data" setup.

   I have a very good understanding of how to design and
   what performance we will get from a narrow range quartz
   oscillator based PLL.

   I have successful experience building clock generators that
   are the time base for systems that multiply up from them  e.g.
   as the TDC will.

   I  (along with others at Fermi)  have bad experience with
   using the "octave tuning range parts" such as mentioned above
   as time bases for systems that multiply up from them.

   As you know it's trivial to generate integer lower frequency
   clocks in the FPGA and lock them to edges of the 40 MHz
   clock in the FPGA's I/O Blocks.  You don't need to add a
   little demo clock card from Silicon Labs or IDT to do this.
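
   The integer division can be modeled as a counter that is
   advanced on every 40 MHz edge, so the slower clock can only
   change state on a 40 MHz edge.  This is a behavioral sketch
   only; the function name and the restriction to even divisors
   are choices made for the example.

```python
# Behavioral model of an integer clock divider driven by the 40 MHz clock.
# Each list entry is the divided clock's level at one 40 MHz rising edge.

def divided_clock(n_ticks, divide_by):
    """Return the divided clock level at each of n_ticks master clock edges."""
    if divide_by < 2 or divide_by % 2:
        raise ValueError("this sketch handles even integer divisors only")
    half = divide_by // 2
    # High for the first half of each divided period, low for the second half.
    return [1 if (tick % divide_by) < half else 0 for tick in range(n_ticks)]


MASTER_HZ = 40_000_000
wave = divided_clock(16, 4)        # 40 MHz / 4 = 10 MHz, 50% duty cycle
divided_hz = MASTER_HZ // 4
```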

   I look at how serious time base people  make their serious
   clocks.  Octave tuning range parts are not in their designs.


> No specs that I can see on the 40MHz clock jitter relative
> to the 10MHz clock

   It will be better:  in the short term the 40 MHz output will
   have less jitter than the 10 MHz GPS reference.

   I  (or you if you want to)  can measure it by running 2
   H-Clk cards against each other.


> Continuing on that, do we have an Allan variance for the 10MHz
> from the NavSync? How does that slew on correction? If we
> substituted a rubidium clock disciplined by the GPS, would the
> jitter reduction (at medium time-scales in particular) survive
> this PLL?

   If we needed to maintain accurate time/frequency without
   the GPS signal for periods of hours then a Rubidium oscillator
   would be a standard choice to provide this "hold over".
   As far as I know GPS outage at the HAWC site is not an issue.
   As far as I know there is no requirement for HAWC to run
   without the GPS being up.  As it stands, without discipline
   the 40 MHz signal from the H-Clk card will be within
   some ppm of the correct value and the accuracy of the
   timestamp will drift accordingly.

   The measurements at HAWC, i.e.  pulse widths and relative pulse
   arrival times,  are "short term" time scale  measurements.
   In the "short term"  (perhaps a second or less)  a good quartz
   oscillator gives less jitter than the Rubidium.

   The timestamp absolute accuracy of 1 usec is straightforward
   to achieve with a GPS clock.

   One of the NavSync manuals in the following directory
   discusses the typical absolute accuracy of this GPS receiver
   http://www.pa.msu.edu/~edmunds/HAWC/Manuals/
   gps_navsync_cw46_tim25_timing.pdf 

   > How does that slew on correction?

   As the 10 MHz output from the GPS receiver moves either ahead
   or behind its ideal position it will be advanced or retarded
   in a step of 5 nsec to put it closer to its ideal position.
   The resulting change in phase of the 40 MHz PLL output will
   take place over a period of about 10 msec.
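
   To picture how a 5 nsec reference step is spread out by the
   loop, here is a minimal sketch that treats the PLL as a simple
   first-order low-pass on phase.  The 2.5 msec time constant is
   an assumption picked so that the response is about 98% complete
   after 10 msec;  the actual loop filter is not described here
   and may well be higher order.

```python
import math

STEP_NS = 5.0    # size of the reference phase step (from the text)
TAU_MS = 2.5     # ASSUMED loop time constant, hypothetical value


def output_shift_ns(t_ms):
    """Shift of the 40 MHz output, in nsec, t_ms after a reference step.

    First-order model: the output moves exponentially toward the new phase.
    """
    if t_ms <= 0:
        return 0.0
    return STEP_NS * (1.0 - math.exp(-t_ms / TAU_MS))


for t in (0.0, 1.0, 2.5, 5.0, 10.0):
    print(f"{t:5.1f} msec  {output_shift_ns(t):5.2f} nsec")
```

   The point of the model is only that the output never sees the
   5 nsec jump as a step;  it is smeared over many thousands of
   40 MHz cycles.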


> The GPS interface has become specialized to this particular
> GPS unit, not general purpose (backup battery etc.), most
> general would be 50ohm antenna in with bias-tee and a NMEA
> input I'd think

   The idea is that any generic GPS receiver could be used
   with the H-Clk card.  Any GPS receiver that makes 10 MHz,
   1 PPS, and NMEA ascii strings could be used.  Those are
   the normal signals that you get from a GPS receiver.

   The H-Clk card can provide battery backup to keep the
   GPS receiver running when there is no VME crate power
   if that is needed/wanted.

   The biasing for the active antenna is in the receiver module.


> The 10 & 40MHz low jitter clocks should be in good strip line or
> coplanar waveguide geometries (I see a calculation for this but
> no additional detail)

   They are differential strip line.

   Where they are near anything else, they are on a private
   layer  (while still holding the overall card to 6 layers).


> Simulation of the circuit layout?

   I'm not certain whether you are asking about simulation
   of the circuit design  or  simulation of the printed
   circuit board layout ?

   The only part of the H-Clk circuit design that one might
   simulate is the PLL - but that can actually be solved in
   closed form.  Its operation is then confirmed on the real
   H-Clk card by monitoring the PLL's filter output  (available
   on the J1 "Access" connector)  while using a test signal
   from an HP generator as the 10 MHz reference input with
   either FM or phase modulation on this test reference signal.
   In this way you can characterize and verify the operation
   of the PLL.

   In this frequency range the pcb itself does not need
   simulation.  All clock signals are differential and properly
   terminated.  The longest trace runs are perhaps 5 inches
   or only about 0.017 free-space wavelengths at 40 MHz.  The
   separation between adjacent differential pairs is about 3x
   the spacing between the two traces within a pair.
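
   The wavelength arithmetic can be checked as follows.  The
   0.017 figure corresponds to the free-space wavelength;  inside
   the board the wave travels slower, so the same trace is a
   somewhat larger (but still small) fraction of a wavelength.
   The dielectric constant of 4.3 is an assumed FR-4-like value,
   not a number taken from the H-Clk stackup.

```python
C_M_PER_S = 299_792_458.0   # speed of light in vacuum
INCH_M = 0.0254             # metres per inch


def trace_fraction_of_wavelength(length_in, freq_hz, eps_r=1.0):
    """Trace length as a fraction of the wavelength at freq_hz.

    eps_r is the relative permittivity seen by the signal (1.0 = free space).
    """
    wavelength_m = C_M_PER_S / (freq_hz * eps_r ** 0.5)
    return (length_in * INCH_M) / wavelength_m


free_space = trace_fraction_of_wavelength(5.0, 40e6)
in_board = trace_fraction_of_wavelength(5.0, 40e6, eps_r=4.3)  # assumed eps_r

print(f"free space: {free_space:.3f} wavelengths")
print(f"in board  : {in_board:.3f} wavelengths")
```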


> EMC plan? I'd assume this is normal practice in HEP. IEC?
> CISPR?  Something custom?

   I did not plan to have the H-Clk card tested at an external
   certification company for electromagnetic compatibility.
   The power in the signal levels on this card is low.  All
   single ended traces on the card are short compared to the
   signal wavelength and thus these traces are inefficient
   radiators.   All external cabling is differential and thus
   not a strong radiator.  There are no switching power
   supplies or other high power components on this card.


> The line driver-GPIO interactions aren't detailed

   Interactions between outputs will be small.  All outputs
   switch on a 40 MHz clock edge.  There are few simultaneously
   switching outputs in either application of this card - less
   than 20% of the I/O pads in the worst case.  There are 128
   evenly distributed grounds to balance the 160 GPIO signals.


> Nothing on firmware, nothing noted on utilization estimates
> for the FPGA

   From previous experience with FPGA design it's hard to imagine
   that any rational implementation of either the timestamp
   function or the trigger control function could require more
   than 25% of this part.  The trigger control should not
   actually require more than a few percent.


> FPGA selection criteria?

   Please see the initial comments at the top of this note.


> Who do you plan to have the BGA work done?

   We typically work with 3 or 4 different assembly companies.
   Any of them should be able to build this card.  We have had
   hundreds of these 456 pin BGA parts assembled before with
   zero failures.  On a well designed pcb, with proper surface
   treatment and a properly designed solder paste mask and
   application process - 456 pin BGA on a 1mm grid is no longer
   a challenging part to assemble.


> Rework capability inhouse?

   We can only rework small pin count SMD parts in house, e.g.
   the Rs and Cs and the 8 pin GPIO driver/receivers chips on
   the H-Clk card.   All bigger parts we send out to a company
   that does this kind of work full time.  They do a very nice
   job and it is fast and not expensive,  e.g. $25 to lift and
   replace a 400 pin part.  For as rarely as we need to do it,
   it's not effective to try and bring that kind of "big part"
   rework in-house.


> Conformal coating?

   Considering the signal levels on this card and the locations
   where it will be used, I do not know of any reason to coat it.


> If you were handed a Xilinx eval board and a clock generator
> chip, would you still use the designed architecture, or
> something different?

   As far as I know we have to build something to match the
   signal level, cable, and connector requirements of the
   timestamp and trigger control functions.  A plain Xilinx
   demo card will not do it.  The intent was to make a card
   that would fulfill all of the currently known requirements
   for these functions in a generic way so that we would
   not need to redo the card as the requirements continue
   to evolve.

   I would strongly consider using a demo card if:
   there were only one of these systems, it was only going
   to run for a month or two, and it was close enough so
   that I could easily get there to repair it.


> What is done with the clock if there isn't a GPS fix?
> A GPS failure leaves the experiment not triggering?

   With zero satellites in view the 40 MHz will be within
   some ppm of the correct value and the timestamp will
   drift accordingly.

   Once the GPS receiver has completed its initial power
   up survey and fixed its location then it can maintain
   a time lock even when only one satellite is visible.


> I haven't addressed this too much, but I think for
> future-proofing...

> Jitter suppressing clock architecture so a better
> clock feeds through

   This is exactly what the 40 MHz PLL does.  In the short
   term the 40 MHz crystal oscillator has less jitter than
   the GPS clock but in the long term the crystal oscillator
   would drift wrt the absolute time.

   At time scales greater than 10 msec  (or wherever the
   optimum crossover is)  the 40 MHz is locked to the GPS
   10 MHz reference.

   This is jitter suppression - it is made explicit so
   that you know how it works and can set it up to run
   in an optimum way for our application.


> Be able to operate without GPS lock using free-running
> oscillator

   As designed, the system will continue to run without
   a GPS lock.  TDC timing accuracy,  i.e. pulse width
   and relative pulse arrival time measurements will remain
   within some ppm of their correct values.  Timestamp
   accuracy will drift from the correct value by this amount.

   If we need the system to be able to run without a GPS
   lock and maintain its specification of 1 usec absolute
   timestamp accuracy - then we need to change the design.
   There are a number of ways to hold the 1 usec timestamp
   accuracy for a day or so with no GPS signal.
   Should we include this as a new requirement for H-Clk ?
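
   To put rough numbers on this, the timestamp error that builds
   up without GPS is just the fractional frequency offset times
   the elapsed time.  The sketch below assumes a constant offset
   (no aging or temperature drift), and the function names are
   made up for the example.

```python
def drift_us(frac_freq_offset, elapsed_s):
    """Accumulated timestamp error in usec for a constant fractional offset."""
    return frac_freq_offset * elapsed_s * 1e6


def max_offset_for(budget_us, elapsed_s):
    """Largest constant fractional offset that stays within budget_us."""
    return budget_us * 1e-6 / elapsed_s


one_ppm_after_1s = drift_us(1e-6, 1.0)             # 1 ppm running for 1 second
needed_for_a_day = max_offset_for(1.0, 86_400.0)   # hold 1 usec for 24 hours

print(f"1 ppm for 1 sec       : {one_ppm_after_1s:.2f} usec")
print(f"offset allowed for 24h: {needed_for_a_day:.2e}")
```

   So an oscillator that is off by 1 ppm uses up the 1 usec
   budget in about one second, while holding 1 usec for a full
   day needs an average fractional frequency error near 1.2e-11.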


> Support timing requirements such that the clock can be
> encoded into a single channel if needed

   I too worry about this from a couple of points of view:

   1. How do we best use the CAEN TDCs.  We know that the TDCs
      can "lose data" at a couple of different points in their
      buffering system.   A small loss in the PMT TDC data may
      be OK.  It will be a bigger problem if some significant
      fraction of the timestamps are screwed up.

      With the CAEN TDCs, which is more likely to screw up:

       - 2 edges every 20 or so usec on each of 32
         adjacent TDC channels

       - 64 edges every 20 or so usec on one TDC channel
         (with the adjacent TDC channels doing ??)

      Do we know the answer to this "data loss" question yet ?

   2. What if we ever want to inject the timestamp into
      each TDC module.  If we ever want to do this then
      it will be necessary to use just one or a couple
      of TDC channels to accomplish it.   I'm concerned
      about this because if I were the all powerful king
      I would dump the 100 miles of PMT coax and put
      electronics at each group of 15-30 tanks.

   Recall that reading out the timestamp via a TDC was something
   that was suggested by Brenda and then seconded by Andy.  It's
   kind of a klutzy system but it is a setup that has a couple of
   nice features and as Andy said, probably has lower risk.
   I can fill this in if you want.


> This implies using a >1GHz PWM (for example) output,
> is this supported?

   With 1 GHz were you thinking about making 1 nsec pulses ?
   1 nsec may be a bit short,  i.e. only  5 counts from the
   200 psec resolution TDCs.   The standard twist and flat
   cable is not very good at carrying 1 GHz signals - especially
   for any distance.

   If we need 32 bits to make a timestamp  and  if we need to
   deliver a timestamp every 20 usec then wouldn't starting a
   new bit every 200 nsec and making the bit either 25 nsec or
   50 nsec long be fast enough ?   That we can do with just
   the current 40 MHz clock.
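
   A minimal sketch of such a pulse-width coded serial timestamp,
   counted in 25 nsec ticks of the 40 MHz clock, is shown below.
   Which width means "0" and which means "1", and the MSB-first
   bit order, are assumptions made for the example and not
   something that has been decided.

```python
TICK_NS = 25                 # one period of the 40 MHz clock
BIT_PERIOD_TICKS = 8         # a new bit starts every 200 nsec
WIDTH_TICKS = {0: 1, 1: 2}   # ASSUMED coding: 25 nsec pulse = 0, 50 nsec = 1


def encode(timestamp, nbits=32):
    """Return the line level at each 40 MHz tick, MSB first."""
    levels = []
    for i in reversed(range(nbits)):
        bit = (timestamp >> i) & 1
        width = WIDTH_TICKS[bit]
        # Pulse at the start of the bit slot, then low for the rest of it.
        levels += [1] * width + [0] * (BIT_PERIOD_TICKS - width)
    return levels


def decode(levels, nbits=32):
    """Recover the timestamp from the per-tick levels made by encode()."""
    value = 0
    for n in range(nbits):
        slot = levels[n * BIT_PERIOD_TICKS:(n + 1) * BIT_PERIOD_TICKS]
        value = (value << 1) | (1 if sum(slot) == WIDTH_TICKS[1] else 0)
    return value


frame = encode(0xDEADBEEF)
frame_ns = len(frame) * TICK_NS          # total time on the wire
edges = sum(1 for a, b in zip([0] + frame, frame) if a != b)
```

   With these numbers a 32 bit timestamp takes 6.4 usec and puts
   64 edges on the one channel, which fits well inside the
   20 usec budget.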

   If we actually need it then the clock manager in the
   currently specified FPGA can multiply the 40 MHz clock
   up into the 300 MHz range.

   In any case by building a shifter in the FPGA and directly
   clocking its I/O blocks one can make a clean serial timestamp
   signal at either 40 MHz or 300 MHz.  I don't think that we
   need an explicit external Pulse Width Modulator for this
   function.  Note that the currently specified GPIO drivers on
   the H-Clk card have 500 psec edge times and are specified
   for use up to  (and somewhat higher than)  200 MHz.


> The trigger control is all firmware I believe

   Yes, all of the trigger control logic is in FPGA firmware
   on the H-Clk card.   All of the: conversion to/from ECL,
   signal fanout, and grouping of signals into "Control Buses"
   is on the CB-Fan card.


> Much of the time stamp as well

   All of the timestamp generation logic is in FPGA firmware.  
   The H-Clk card directly delivers 2 copies of the timestamp
   (32 bits each)  at the correct signal level and cable/connector
   format so that they will plug directly into the CAEN TDCs.


> Does an outline of the firmware exist?

   Yes, this was written by Udara and Tilan and is on the web at:
   http://hawc.pa.msu.edu/gtc/GTC_System.pdf


> Is that compatible with the hardware chose? Hardware driven by
> firmware requirements?

   If rationally implemented, I see nothing in either the
   timestamp or trigger control functions that would not be an
   easy speed and space fit into the currently specified FPGA.
   In a totally irrational implementation - who knows.
   We have much bigger and more complicated systems running
   in these parts.


> The CB-Fanout, as I understand it, has low functionality other
> than level conversion and fanout   This seems like a safe division

   Yes, the idea is that H-Clk is LVDS only and has 160
   generic GPIO signals.  The only special function on it
   is to make and distribute the clean low jitter 40 MHz
   based on the 10 MHz GPS reference.

   CB-Fan is specific to the CAEN TDC - HAWC  Control Buses,
   but its functions are simple:  convert  LVDS <--> ECL  as
   necessary, fanout, and group signals into Control Buses.
   The intent is that it is a fast, easy, straightforward design.


> Do you have the BoM for the stuffing house?

   No, the final Bill of Materials is not available yet
   but I'm happy to discuss the choice of any of the
   parts on the card.   There is nothing very special
   on this card and basically nothing that we have not
   used before on other designs.


> I'd think there were far easier ways of future-proofing the FPGA
> end of the design   In any case, this mezzanine would require a
> new board to be spun for a different FPGA

   The real purpose of the mezzanine is to make the layout of
   the main H-Clk card easier and cleaner - especially the
   clock generation and distribution part.  But yes, moving
   to a new FPGA would require only a rather easy new MEZ-456
   layout using the existing design as a seed.


> Were the reference breakout design documents followed?
> If not, why not?

   Basically yes, please see the comment about this above.
   Where I differ it's basically in the direction of making the
   electrical characteristics more conservative and the pcb
   manufacturing and assembly tolerances easier to meet.


> ...this design (review) was presented far too late...

   I completely agree.  This review is too late in the
   process if the intent of the review was to change the
   implementation of the timestamp or trigger control
   functions.   I believe that the basic VME / FPGA
   implementation of these functions has been discussed
   in EVO meetings and presented at the 3 previous HAWC
   group meetings.


> ...the likely costs of this design do weigh on my comments...

   The cost to HAWC should be relatively low.  Except for the:
   connectors, GPIO driver/receiver chips, raw printed circuit
   boards, and assembly   I believe that I have basically
   everything else "in stock" left over from other builds.
   All of these parts will be free to HAWC.  As you know one
   either has to use up left over spools of SMD parts or they
   end up in the dumpster.  If we want any kind of modular
   scheme with reasonable cabling and connectors then these
   parts that HAWC will need to purchase  will be needed no
   matter what the design.


> ...The proposed approach is likely to be ultimately successful...

   That's certainly my intent   :)


          Thanks,   Dan