Overview
Preface
A word about ESD (ElectroStatic Discharge): ESD kills. Boards. Much like cardiovascular disease in humans, it can kill slowly, over time, unnoticed until it's too late. Once a board is in the test machine, there is an opportunity to replace a failed component only every 6 months or so; in other words, if some part of a board fails once it's in the machine, there is next to nothing that can be done to fix it. Always handle the boards by their edges and don't touch any of the components until you've grounded yourself by touching some conductive ground on the board. This is actually easy, since there are many grounds: below each of the input connectors there is a large ground jumper, the soldered pads surrounding the mount holes on the board are grounded, and so are the coverings of the SCSI and LEMO connectors. More on those later.
When inserting or removing components (e.g., the clock or a mezzanine card), be sure that the power is off; otherwise these components can be damaged.
Board Functionality and Layout
The test procedure is essentially the same for all of the ALCT (Anode Local Charged Track) boards. The differences are which files are loaded into the EEPROMs (Electrically Erasable Programmable Read-Only Memory), the number of chips on the board, and thus the amount of work that has to be done in testing them.
The purpose and function of the boards is to gather data from the AFEBs (Anode Front-End Boards) and CFEBs (Cathode Front-End Boards) in the form of raw signal hits, filter out the good stuff from the bad, and send it up the processing chain to the next board. The signals are taken up by the array of 40-pin black plastic connectors that occupies much of the board's real estate. Only 16 out of the 40 pins are used to carry data, and therefore the number of inputs on each board is an integral multiple of 16:
- Small boards process 288 inputs: 18 input connectors
- Medium boards process 384 inputs: 24 input connectors
- Large boards process 672 inputs: 42 input connectors
From the number of inputs comes the naming scheme of the boards and filenames. Notice that the connectors on the board are numbered starting from 1. This is important because the test program on the computer numbers the connectors from 0. So, computer connector number + 1 = board connector number.
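The input counts and the two numbering schemes can be sketched in a few lines of Python. This is only an illustration of the arithmetic above; the names used here (`CONNECTORS`, `input_count`, `board_connector`) are made up for the example and are not part of the test program.

```python
# Illustrative sketch only; these names do not come from ALCT_Test.
DATA_PINS_PER_CONNECTOR = 16  # 16 of the 40 pins carry data

# Input connectors per board size, from the list above.
CONNECTORS = {"small": 18, "medium": 24, "large": 42}


def input_count(size):
    """Number of inputs for a board size: connectors x 16 data pins."""
    return CONNECTORS[size] * DATA_PINS_PER_CONNECTOR


def board_connector(computer_connector):
    """Convert the test program's 0-based connector number to the
    1-based number printed on the board."""
    return computer_connector + 1


for size in CONNECTORS:
    print(size, input_count(size))
```

So if the test program complains about connector 0, look at the connector labeled 1 on the board.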
The boards have to acquire data at 40MHz, so that is their clock speed.
Data input to the boards is first run through a delay chip. The purpose of
the delay chip is to set a HIGH signal (delayed by a preset amount) on its
output pins if the signal input to it reaches a certain threshold. The
threshold and delay time are both set by the Slow Control controller chip
on the board (Xilinx Spartan XCS40XL FPGA), which sits below the input
connector array on the underside of the board and is programmed by an
EEPROM sitting directly on top of it. In turn, the slow control chip
programs the delay chips through a serial bus that runs perpendicular to
the signals from the inputs and outputs. Since each connector takes 16
inputs, so does the delay chip, and that is also the number of its
outputs. These output signals are then run into a multiplexer which runs
at 80MHz and collects data from two delay chips and relays it to the
mezzanine card. If you look at the back of the board, you can see the delay
chips next to the input connectors and a multiplexer for every two
right above them.
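As a rough cross-check of what you should see on the back of the board, the chip counts follow from the connector counts. The sketch below assumes one delay chip per input connector and one multiplexer per pair of delay chips, which is how the paragraph above reads; the function name is invented for the example.

```python
BOARD_CLOCK_MHZ = 40   # rate at which the board acquires data
CHIPS_PER_MUX = 2      # each multiplexer serves two delay chips


def chip_counts(connectors):
    """Return (delay_chips, multiplexers) for a board.

    Assumes one delay chip per input connector and one multiplexer
    for every two delay chips.
    """
    delay_chips = connectors
    multiplexers = delay_chips // CHIPS_PER_MUX
    return delay_chips, multiplexers


# Relaying two 40 MHz streams over one path is consistent with the
# multiplexer's 80 MHz rate.
mux_clock_mhz = BOARD_CLOCK_MHZ * CHIPS_PER_MUX

for connectors in (18, 24, 42):
    print(connectors, chip_counts(connectors), mux_clock_mhz)
```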
This data is then fed into the mezzanine card sitting in the middle of the top half of the board. Contained on the card is a Xilinx Virtex FPGA (Field Programmable Gate Array) which is programmed by one or two EEPROMs also located on the mezz card:
Mezz Card Details for Different-Size Boards
| Board Size: | 288 | 384 | 672 |
| Virtex Chip Type: | 600 | 600 | 1000 |
| No. of EEPROMs: | 1 | 1 | 2 |
It's the job of the Virtex chip to sort out the data and send the good stuff to the next higher-up processing board through the SCSI bus connectors.
Above the Virtex chip, along the edge of the board from left to right (if the board is oriented so that the Virtex chip is in the top half) are the SCSI, JTAG, LEMO and power connectors.
- Power: The board takes 1.8V, 3.3V, and 5.5V. During testing in the lab, power is provided by two high-precision power supplies. The plug coming from the power supplies can fit only one way into the socket on the board, so if it's not going in, try rotating it. It is possible to burn out the board if the connector is inserted the wrong way, but that would take a lot of force and you're smarter than that.
- Note that output on the power supplies can be en/disabled
individually using the "OUTPUT" button on the top right of the faceplate.
When power is enabled, the LED (light emitting diode) above the button
glows red. Remember this button when one of the tests isn't working and the
board's LEDs show no power on one of the rails.
- LEMO: "LEMO" is the name of the company that makes them. They're Swiss. That's all I know.
- JTAG: From http://www.tuxscreen.net/wiki/view/JTAG: "JTAG stands for 'Joint Test Action Group'. It is a hardware method of talking to memory and flash without requiring any app running on the hardware. In other words: mess up your bootloader and your device is a brick? Use JTAG to install a new working loader." On these boards, JTAG is used to program the EEPROMs.
- SCSI: "Small Computer System Interface." The boards use a newer LVD
(Low Voltage Differential) standard to communicate with other boards.
These ports are also used during testing. LVD is used because signal HI/LO
is determined by current which stays the same over long distances, unlike
voltage (used by the IDE bus, for instance) which drops. This means that
the boards can be connected over relatively long distances, e.g., several
feet.
Below the left edge of the JTAG connector is the clock ON/OFF jumper
and the blue clock socket. The pin closest to the JTAG connector (topmost
pin) is pin 1, the next one down is 2, and the last one is 3. When 1-2 are
shorted (a jumper -- little black plastic connector with gold sheet inside
-- is placed over them) the clock is enabled. When 2-3 are shorted, nothing happens, and the clock does not run. During testing, a relatively expensive clock is used, so to protect and reuse it, it's inserted into a blue tray exactly like the one on the board, and this tray in turn is inserted into the board. It does not matter how the clock is oriented in its support tray, but the assembly must be inserted into the board so that the notch on the clock lines up with the little diagram printed on the board below the clock socket. The clock is made by EPSON and is slightly larger than a jumper.
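The jumper logic is simple enough to write down as a small Python function. This is just a sketch of the rule described above, plus the fact (from the programming instructions later on this page) that removing the jumper entirely also disables the clock; the function name and argument format are made up for the example.

```python
def clock_enabled(shorted_pins):
    """Return True if the clock runs for the given jumper position.

    shorted_pins is a pair such as (1, 2) or (2, 3), or None when the
    jumper has been removed entirely.
    """
    if shorted_pins is None:
        return False  # no jumper: clock disabled
    # Only a jumper across pins 1-2 enables the clock.
    return tuple(sorted(shorted_pins)) == (1, 2)


print(clock_enabled((1, 2)))  # jumper on the two pins nearest the JTAG connector
print(clock_enabled((2, 3)))
print(clock_enabled(None))
```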
Preparing a Fresh Board
- If you're setting up a large board, check that there are two wires soldered on the back: one to the left of the mezz card and one near the bottom middle.
- If these aren't there, find another board to work on.
- Check that there's a strip of textured brown/gray tape on the board above the bottom row of LEMO connectors.
- If not, apply a strip about 4in long.
- Attach two black plastic sleds of the right length per board size
along the underside of the board using the 7/16" wood screws found in bin #9
of the screw tray. Aim to use every other hole in the sleds. You'll
probably want to use the electric screwdriver for this. Hold it parallel to
the axis of the screw, apply pressure to keep the bit from coming loose and
stripping the screw, and don't stop until the screwdriver clicks.
- The sleds provide support to the board and reduce flexing during shipping and when connectors are inserted/removed.
- Fill the four holes around the mezzanine slot with support rivets:
- Insert the rivet (bin #5) so that the flat top is resting on the top side of the board.
- On the other side, add on a bumped washer (bin #6) and a nut (bin #7).
- Tighten the assembly with the hex nut driver.
- Assemble the catch tabs for the JTAG connector:
- Find two of the gray connector tabs for the JTAG connector on the
assembly table. They're usually sitting in a petri dish. If not, look under
the table for them. Also find two small metal cylinders which will fit into
the holes of the board's JTAG connector. These act as locking pins.
- Assemble the tabs into the connector by placing the tab into the
connector and pushing a locking pin through their common hole. A pen with a
flat cap is useful for this. Repeat for other side.
- Find and insert a mezzanine card
- There is usually a box of good mezzanine cards sitting on the big table below the shelves holding components.
- The small and medium boards use mezz cards with a Virtex 600 chip on them. These mezz cards have only one EEPROM (small square chip near the top left.)
- Large boards use mezz cards with a Virtex 1000 chip and two EEPROMs.
- To insert the mezz card, line it up by feel with the connectors on the board, and press down firmly on the card without touching any of the chips, connectors, or pressing down on the unsupported top edge. You will hear and feel the sound of 800 pins making solid contact when the card is firmly seated. It sounds a little bit like footsteps in snow.
- For future reference, now is a good time to mention how to remove a mezz card. Find a set of four thumbscrews that fit into the large holes found at the cardinal points on the mezz card. These are the same points that lie over the rivets you just inserted; in fact, this is why the rivets are inserted in the first place. Tighten the thumbscrews in small amounts until the mezz card pops free on its own.
- Enable the debug LEDs on the top right by shorting jumpers 1-2 on the three-jumper block found below the top right mounting pad. This block is sometimes labeled "LED_OFF" and/or "SW2".
- Obtain a clock and a jumper. These should be in a little tray on the
worktable.
- Program the Slow Control by following the upcoming instructions.
To program the board, the clock must be shut off and a connection has to be established to it:
- Disable the clock by shorting pins 2-3 or taking the jumper off entirely.
- The clock can also be en/disabled while power to the board is on. By only disabling the clock and not cutting the power some time can be saved during testing when it's necessary to reprogram the EEPROMs.
- Connect the JTAG connector.
The board is programmed across the JTAG interface coming out of the computer's LPT (Line Printer Terminal, AKA parallel) port and going into the JTAG connector on the board. On the computer, the ALCT_Test (icon is a microchip) program is used to select what chip should be programmed.
- "Setup" tab
- "Select X-Blaster Channel" either "Slow Control Programming" or "Virtex Programming", depending on whether you are setting up a board (former) or performing the testing procedure (latter).
- "Set Chain"
Now is a good time to interject a note about ALCT_Test. Whenever you
want the program to talk to the board, you have to open a connection to it
by hitting "Open" in the "Setup" tab, then select the function you want
to access:
- Slow Control Control: Set automagically by the tests that need it,
this setting is primarily used in testing the delays, since they are
controlled by the slow control on the board.
- Slow Control Programming: What it says.
- Virtex Control: Also set automagically by the tests that need it,
this setting lets you talk to the Virtex chip and tell it what to do.
Useful for doing things like the single cable test.
- Virtex Programming: This one's fairly intuitive too.
And then hit "Set Chain" to carry out your selection.
iMPACT is used to actually program the chips:
- Launch iMPACT (icon is a little computer connected to a board) and follow the default choices:
- Operation Mode Selection: Configure Devices: Next >
- Configure Devices: Boundary-Scan Mode: Next>
- Boundary-Scan Mode Selection: Automatically connect to cable and identify boundary-scan chain: Finish
- The first chip that always comes up is the FPGA on the board, but there's no point in programming that because when the power goes off, all data is lost on the FPGA. Therefore, hit Cancel and go to the next chip.
- Select the required .mcs file(s):
Choosing the right .mcs file(s)
| Board Size: | 288 | 384 | 672 |
| Slow Control Programming | slow_control3.mcs* | slow_control3.mcs* | slow_control3.mcs* |
| Virtex Programming: Single Cable Test | d24_288.mcs | d24_384.mcs | d24_67200.mcs, d24_67201.mcs |
| Virtex Programming: Remaining Tests | test_288.mcs | test_384.mcs | test_67200.mcs, test_67201.mcs |
| Post-Burn ALCT Script and Self Tests | alct288.mcs | alct384.mcs | alct67200.mcs, alct67201.mcs |
Again, the naming convention is clear. Note that since there are two EEPROMs on the large boards, they both have to be programmed and they use different filenames: the file ending in 00 is for the first chip in the chain, and the one ending in 01 is for the second. The boards ship with the alct___.mcs firmware installed on them.
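If it helps to see the table as a lookup, here is a minimal Python sketch. The file names are copied from the table; the dictionary keys (`"slow_control"`, `"single_cable"`, and so on) and the function name are invented for this example and have nothing to do with iMPACT or ALCT_Test.

```python
# File names copied from the table above. Assumptions: the "*" after
# slow_control3.mcs is treated as a footnote marker and dropped, and the
# 384 single-cable file is assumed to end in ".mcs" like its neighbours.
FIRMWARE = {
    "slow_control": {
        288: ["slow_control3.mcs"],
        384: ["slow_control3.mcs"],
        672: ["slow_control3.mcs"],
    },
    "single_cable": {
        288: ["d24_288.mcs"],
        384: ["d24_384.mcs"],
        672: ["d24_67200.mcs", "d24_67201.mcs"],
    },
    "remaining_tests": {
        288: ["test_288.mcs"],
        384: ["test_384.mcs"],
        672: ["test_67200.mcs", "test_67201.mcs"],
    },
    "post_burn": {
        288: ["alct288.mcs"],
        384: ["alct384.mcs"],
        672: ["alct67200.mcs", "alct67201.mcs"],
    },
}


def mcs_files(stage, board_size):
    """Return the .mcs files to load, first chip in the chain first."""
    return list(FIRMWARE[stage][board_size])


print(mcs_files("single_cable", 672))
```

For the large boards the list has two entries, one per EEPROM on the mezz card.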
When programming the slow control, iMPACT requests that you give it a part name. This will be "xc18v01_vq44".
- Select the chip(s) to be programmed and start programming either by right-clicking on the chip and selecting "Program" or, on the menu bar, by choosing Operations: Program.
- On the large boards, you can select and program both chips simultaneously by holding down Ctrl when selecting the second chip and using the menu bar command to program the chip.
- Surf the web/get coffee/chill; the large boards can take a couple of minutes to program.
- To load the new firmware into the FPGAs, cycle the power by pulling the power plug and reinserting it.
- When starting from scratch, you can save a few steps by programming
both the Slow Control and Single Cable Test Virtex firmware before cycling
the power. "Select X-Blaster Channel" as "Virtex Programming", select the
right firmware, and you're golden.
- It's also useful to know when a mezzanine card is programmed:
- There will be ~3.4V across the GND and DONE pads of the mezzanine card, which can be found next to the EEPROM on the top side.
- When the Single Cable firmware is programmed, LEDs D4, D5, D6 and D8 light up faintly. You may have to block the light from the ceiling to see them.
- When the firmware for the remaining tests is programmed, LEDs D2 and D24 light up. It is possible that there is an error on the board such that the mezz card is correctly programmed but one or more of these lights don't light up. This is a problem with the board's circuitry and needs to be reported.
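The checks above can be summarized in a short Python sketch. The LED names and the ~3.4V figure come from the list above; the 0.3V tolerance, the function names, and the return strings are assumptions made for the example, so treat this as a memory aid rather than a test tool.

```python
SINGLE_CABLE_LEDS = {"D4", "D5", "D6", "D8"}
REMAINING_TESTS_LEDS = {"D2", "D24"}


def done_pad_ok(volts, nominal=3.4, tolerance=0.3):
    """True if the GND-to-DONE voltage is near 3.4V.

    The tolerance is an assumed value, not a lab specification.
    """
    return abs(volts - nominal) <= tolerance


def guess_firmware(lit_leds):
    """Guess which Virtex firmware is loaded from the lit debug LEDs."""
    lit = set(lit_leds)
    single = SINGLE_CABLE_LEDS <= lit
    remaining = REMAINING_TESTS_LEDS <= lit
    if single and remaining:
        return "ambiguous"
    if single:
        return "single cable"
    if remaining:
        return "remaining tests"
    # Could be unprogrammed, or a board fault hiding an LED.
    return "unknown"


print(guess_firmware(["D4", "D5", "D6", "D8"]))
```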
To check if the slow control is programmed (useful when you're testing a
board you didn't start testing from scratch), go into ALCT_Test, "Slow
Control" and hit "User ID Check". If it comes up as a bunch of 0s or Fs, then it hasn't
been programmed.
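The rule of thumb for reading the "User ID Check" result can be written as a tiny Python function. This assumes the ID is shown as a string of hex digits and that "a bunch of 0s or Fs" means all zeros or all Fs; the function name is made up for the example.

```python
def slow_control_programmed(user_id):
    """Return False if the user ID is all 0s or all Fs, True otherwise.

    user_id is the hex string shown by "User ID Check"; an optional
    "0x" prefix is ignored. The display format is an assumption.
    """
    digits = user_id.strip().lower()
    if digits.startswith("0x"):
        digits = digits[2:]
    if not digits:
        raise ValueError("empty user ID")
    unique = set(digits)
    # All zeros or all Fs means the slow control has not been programmed.
    return unique != {"0"} and unique != {"f"}


print(slow_control_programmed("FFFFFFFF"))
```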
Troubleshooting
And by "Troubleshooting", I mean "The board appears to have failed to program and here's what might've gone wrong."
- Ensure that the board is getting the right power:
- All LEDs below the power connector should glow bright red. This
means that the board is getting power.
- If they don't, check that the power supplies are enabled.
- Wiggle the power connector: are any cables inside the connector loose?
- Flex the board gently with your fingers around the power connection. If power seems to come on sporadically (check by looking at the LEDs on the board) then something is broken inside the board and it may be reparable.
- Was the board power cycled after programming?
Testing Procedure
Notes
Boards to be tested are usually sitting in a pile behind the plywood board racks. Each board has its own info sheet that is used to keep track of where it is in the test process. These sheets are all in a blue binder sitting on top of the plywood racks. When testing a board, the sheet is taken out of the binder and successfully completed tests are signed off with the tester's initials. When testing is finished, the sheet is returned to the binder. If a board doesn't have a sheet, start a new one for it. Blank sheets can be found in the back of the binder.
If a board fails a test, note in the error log what happened and where in
the test process the board failed. The error log is a stapled
packet of sheets, also sitting somewhere on the plywood rack, that has
columns for the board #, mezz #, error type, and repair status. Your job is
to fill out the first three of those columns. If a board seems to fail a
test, always ask someone more experienced to take a look at it first
before filling out the error log. Sometimes the error is only minor, and
you can ignore it and go on with the other tests (this is especially
useful if you've already spent 15 minutes hooking up the board to
something and there are other tests that need to be done in this state.)
Otherwise, it may be possible to fix it on the spot. But most likely
you've simply glossed over a necessary step in the test.
Before you begin the tests, get a blank info sheet and fill in the ALCT
Board #, Mezz Board #, and Delay Type. The delay type is written on the
delay chips on the underside of the board. Beware that the board may have
mixed delay types so it's not enough to look at one chip.
There are two sets of tests: without the test board, and with it. These
instructions are written with the assumption that the board's state changes
only so much between tests as is specified herein. In other words, the
instructions for a test may not necessarily ask for the clock to be enabled
or the mezz card to have a certain firmware programmed on it if these
things should already be in place when the instructions and tests are done
in order.
Matt Matolcsi (madhat@ucla.edu); Last revision: 2003/07/16