/ - Board Testing Home - // - Overview - // - Tests Without the Test Board - // - Tests with the Test Board - // - Pre-Burn Assembly - // - New Tests After Burn-In - // - Final Assembly - /
With the Test Board
Notes
The test board's purpose in life is to support testing of the ALCT
boards. It's a fairly large board with a total of 42 cables coming out of it. Some
of these connections will be (not might be) broken, so some of the
tests have to be completed manually using an oscilloscope. Only two test
stations have test boards. If you're sitting at a station without one,
congratulations, you can start on another board now.
When connecting or disconnecting cables between the boards, neither one should be
powered. Take extra care to unplug the test board after you're done using
it because there is a specific buffer that gets very warm and burns out at
an accelerated rate, causing the board to fail. This buffer can be
replaced but it costs time and money that's better spent elsewhere.
Plugging in the test cables is probably the trickiest and most
time-consuming part of this job, so here is a suggested guide. (No one's
saying you have to follow it, but then, no one says you have to _____
either..., it's just a good idea, you know?)
For purposes of this explanation, the assumed orientation of the boards
will be the test board standing upright with its power connector superior
and the tested board lying on the table with its SCSI connectors facing
away from the test board and the black signal connectors closest to the
test board. In this orientation, the SCSI cables run from the test to the
tested board without crossing each other.
On the test board the connectors are numbered from J0 to J41, J0 =
bottom left, J1 = bottom right, up one row, J2 = left, J3 = right, up a
row, J4 = left, J5 = right..., zig-zag style (or retrofit fire-escape
ladder style a la the "West Side Story" logo, if you like) all the way up. The
trick is to start connecting cables from the middle until you know what
goes where. The bottom left connector of the test board goes into the
first connector of the upper half on the tested board, and the bottom
right connector goes into the connector immediately below the midline on
the tested board, AKA, just below the one you just plugged in. Now fill
the top half of the tested board upwards by going up the left side of the
test board, and fill the bottom half of the tested board downwards, starting
from the top connector of the lower half, by going up the right side of the test board. If you
did everything right, there will be an equal number of connectors on each
side of the test board and no empty slots on the tested board. Since we're
physics and not math, here's a handy chart:
Determining Where to Start
| Board Size: | Small Board | Medium Board | Large Board |
| Bottom connector of upper half | 10 | 13 | 22 |
| Top connector of lower half | 9 | 12 | 21 |
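For those who would rather have the whole plugging order spelled out, here is a minimal Python sketch of the scheme described above. It rests on assumptions that are not stated on this page: the connector counts (18, 24, and 42) are inferred from the chart by assuming two equal halves numbered from 1, and the right-hand column of the test board is assumed to fill the lower half downwards. The names `BOARD_CONNECTORS` and `cable_map` are made up for illustration.

```python
# Assumed connector counts, inferred from the chart above
# (equal halves, tested-board connectors numbered from 1).
BOARD_CONNECTORS = {"small": 18, "medium": 24, "large": 42}


def cable_map(board_size):
    """Map test-board connector index (the N in JN) to tested-board connector number."""
    total = BOARD_CONNECTORS[board_size]
    half = total // 2
    mapping = {}
    for j in range(total):
        row = j // 2  # row on the test board, counted from the bottom
        if j % 2 == 0:
            # Left column (J0, J2, ...): fill the upper half upwards.
            mapping[j] = half + 1 + row
        else:
            # Right column (J1, J3, ...): fill the lower half downwards.
            mapping[j] = half - row
    return mapping


if __name__ == "__main__":
    for j, connector in cable_map("small").items():
        print(f"J{j} -> tested-board connector {connector}")
```

For the small board this starts with J0 going to connector 10 and J1 going to connector 9, which matches the chart. Check a few cables by hand against the text above before trusting it.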
Troubleshooting
Sometimes it will be one of the cables connecting the two boards, and not the
test board itself, that is broken. Remove the suspect cable from the board and
test it using the cable tester built by Kristjan. If the cable is bad, you
can replace it with a good one from the box of good cables in the back of
the room labeled "Good cables", and leave the bad cable in the box labeled
"Bad cables". The cables in the "Bad" box are periodically fixed and moved
over to the "Good" box.
If the cable is fine, and you've also tried another good cable, the
problem is most likely on the test board. These sorts of failures tend to
be permanent and are the reason that some things will be broken,
forcing you to do an old-fashioned manual test.
SCSI Test
Notes
This tests the input/output of the SCSI connectors.
Procedure
- Connect only the SCSI connectors between the two boards and
power them on.
- On the tested board, jumper the second pair of pins on the upper
jumper block, TP1_30 and
TP1_31.
- Jumper the last pair of pins in the same block, TP1_0 and TP1_1. LEDs
D4, D5, and D6 should now glow without blinking. Blinking = problem.
- Reset the test board by shorting jumpers 19 and 20 on the test
board's upper right jumper block. The LEDs on the tested board should go
dead.
- Remove and reinsert the jumper over TP1_0 and TP1_1 on the tested
board. The 3 LEDs that lit up before should light up the same way again.
Troubleshooting
- Is there a clock in the tested board? There shouldn't be.
- No other jumpers should be set on the tested board (with the possible
exception of clock jumper on 2-3, disabling the clock.)
- Is the JTAG connector hooked up? It shouldn't be.
- Are the SCSI cables between the test and tested board reversed? Using
the above orientation, they shouldn't cross.
Delay Automatic Manual Test
Notes
This tests the ability of the delay chips to... delay. I think.
Funny story! It's called the Automatic Manual Test because it used to
be done entirely manually, and then a Russian guy found a way to do it
automatically and called it the Automatic Manual Test.
Procedure
- Connect the SCSI and test cables between the boards, and hook up the
JTAG connector. I'll be here when you get back. I promise.
- Open a connection to the board in ALCT_Test, go to "Delays Check",
and hit "Self Test".
- If everything comes up without any errors, which it never does,
you're home free to do the next test. Errors would be listed in the
program log that scrolls by in the bottom pane of the program window. What
you're generally looking for is that all of the delays come back as 30
± 4ns.
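If you end up eyeballing a long list of delays, the pass criterion is easy to write down. The sketch below only shows the 30 ± 4 ns window. The format of the ALCT_Test log isn't described here, so the values are assumed to be typed in by hand, and the names `delay_ok` and `failing_delays` are hypothetical.

```python
NOMINAL_NS = 30.0    # expected delay, from the procedure above
TOLERANCE_NS = 4.0   # allowed deviation, from the procedure above


def delay_ok(delay_ns):
    """True if a single delay falls inside the 30 +/- 4 ns window."""
    return abs(delay_ns - NOMINAL_NS) <= TOLERANCE_NS


def failing_delays(delays_ns):
    """Return (position, value) for every delay outside the window."""
    return [(i, d) for i, d in enumerate(delays_ns) if not delay_ok(d)]


if __name__ == "__main__":
    # Made-up readings, for illustration only.
    readings = [30.0, 26.0, 34.0, 25.9, 34.5]
    print(failing_delays(readings))
```

Readings of exactly 26 ns or 34 ns are treated as passing here. Whether ALCT_Test includes the edges of the window is an assumption.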
Troubleshooting
This test can have a bundle of things go wrong with it, so first check
that you followed the instructions both in this test and the tests before
it. Since that would take a lot of effort, though, I'll rattle off a few
common mistakes I've made:
- Is the clock enabled? It shouldn't be. The test board provides the
clock, and enabling the tested board's clock disables the test board's
clock, regardless of whether or not there is a clock inserted into the
tested board. A clock sitting in the board is harmless as long as it's not
enabled.
- Is the upper row of jumpers, nearest to the LEDs, completely empty? It
should be.
- Is there power going to both of the boards?
- Is the JTAG connected?
- Is the slow control programmed?
- Are the SCSI cables crossed?
- Are the test cables inserted in the correct order?
- When all these fail, try the following:
- The Martin-Cable-Switching-Trick: Take the lower SCSI cable
and flip it so that the end that used to be in the test board is now in the
tested board. If that doesn't work, try doing it on the other cable, or do
both and save time! I've seen this trick work miracles. It works because
the SCSI cables are also flaky and moving them around sometimes brings
broken connections inside to see the light.
- Cut the power to the tested board, then the test board. Now close
ALCT_Test. Hell, shut the PC off. Reboot. Open up ALCT_Test. Give the
boards juice in the reverse order from which it was cut. Repeat the test. This
trick works sometimes because the JTAG driver on the PC is buggy and gets
corrupted occasionally. By "occasionally", I mean, "We could narrow down
when the driver gets corrupted pretty well but even if we did, we wouldn't
have the resources to fix it."
- There is a rumor that the test boards can't supply enough current to
the big boards when all the cables are plugged in. I've never seen this
problem, but it may be worth doing the test in upper and lower halves. This
also makes it easier to plug in all the cables.
- If all this fails, or you know that a certain connector on the test
board always fails (and hence the problem is on the test board), you will
have to perform the following manual test.
The Manual Automatic Manual Test
Notes
Actually, this is probably how the test was done back in the olden
days, before they had test boards and such. For all we know, they generated test impulses by banging rocks together.
Procedure
- This test is done with the test board completely disconnected from
the tested board. Therefore, unless you enjoy plugging giant messes of
cables in, it is suggested that you leave this test for last after you've
done all the other tests that have to be done with the test board.
- Set up the oscilloscope: trigger on ch. 2, looking at the signal on
ch. 1. The signal happens over a relatively brief timespan so the time
division will have to be low: around 40ns, I think. You can adjust the time
division by rotating the knob under the "Horizontal" column: clockwise for
smaller ("opens up" a signal on the scope), counterclockwise for the
opposite.
- Plug the terminator into the channel being tested. This will probably
be one of the channels that failed the previous test. The terminator looks
like an ordinary test cable, with a metal-plate protected structure on one
end. For bonus points, get Martin to pronounce "terminator".
- Connect the JTAG, obtain and enable a clock, and give the board
power.
- In ALCT_Test, go into "Slow Control" and under "Delays Test" enable
"Test Delays" and then disable it. This gives us the oscillating output on
the LEMO connector that's plugged into ch. 2 on the scope which we trigger
off.
- Using the fine probe of the oscilloscope, drag the tip across the 16
output pins (facing away from the black connector) one by one, looking for
a signal that jumps back and forth by about 30ns.
- Using the Cursor function of the oscilloscope, determine if the signal
is 30 ± 4ns wide for each of the pins.
- Since the oscilloscope's screen freezes for about a second when it
doesn't get a signal instead of just flatlining, you have to go slowly over
each of the pins to see whether or not there's a signal. Be very careful
when doing this because it's happened that a pin failed to have a signal on
it but passed the test anyway because the operator went too quickly and
didn't notice it.
- Move the terminator over to the next connector to be tested and repeat
the test as necessary.
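Since the classic way to botch this test is to skate past a dead pin, it can help to write down a reading for each of the 16 pins and let something dumb do the judging. The following is a minimal sketch, assuming you note each cursor reading by hand and record `None` wherever the scope showed no signal. The function name, the 0-to-15 pin numbering, and the result layout are all hypothetical. The off-by-one between ALCT_Test (numbered from 0) and the tested board (numbered from 1) is taken from the troubleshooting notes for this test.

```python
PINS_PER_CONNECTOR = 16


def check_connector(alct_test_index, widths_ns):
    """Judge one connector from 16 hand-recorded cursor widths in ns.

    A width of None means no signal was seen on that pin, which is a failure.
    """
    if len(widths_ns) != PINS_PER_CONNECTOR:
        raise ValueError("expected one reading per pin (16 in total)")
    bad_pins = [
        pin
        for pin, width in enumerate(widths_ns)
        if width is None or abs(width - 30.0) > 4.0
    ]
    return {
        # ALCT_Test counts connectors from 0, the tested board from 1.
        "connector_on_board": alct_test_index + 1,
        "bad_pins": bad_pins,
        "passed": not bad_pins,
    }


if __name__ == "__main__":
    # Illustrative readings: 15 good pins and one with no signal.
    print(check_connector(4, [30.0] * 15 + [None]))
```

Because a missing reading is an explicit failure, a pin that was never actually looked at can't sneak through as a pass.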
Troubleshooting
- There's no signal whatsoever? OK, this one's the worst.
- Is the terminator in the right connector? Remember that the
connectors are numbered from 0 in ALCT_Test and from 1 on the tested board.
- Wiggle the probe around. It may be dirty or just not making good
contact.
- Are the Virtex and Slow Control programmed with the right firmware?
Is the JTAG connected? Is the clock in and enabled? Are all the other
jumpers off the board except the clock and LED enabler?
- When all else fails, reboot. Everything. This tends to work very
well.
- There's what looks like a flat molehill moving left-right on the
scope but no definite signal? You're making contact with two pins
simultaneously. Pick up the probe and try again.
- There's what looks like a bunch of jagged signals but nothing moving back and forth in time? The scope may be set up to look in the wrong time domain. Set the amplitude scaling of the channel to 1V/mark and look to the left and right of where you're looking by using the horizontal position button. Since you're looking for a signal, it's obviously a good idea to keep the probe on the delay chip pins as you're doing this.
- There's a signal but it's not going back and forth in time: You're
triggering on ch. 1 probably, but you need to be triggering on ch. 2.
"Triggering" means that the scope is looking for the signal on an input to
go past a preset threshold (and usually in a specific direction, e.g., up,
or down) and when this happens, it starts drawing the screen. When you
trigger on ch. 1, the scope triggers when the signal rises, not at a
specific point in time, so the signal will appear to stay steady. Channel
2, however, is coming off the LEMO connector on the tested board and always
drops at the same time (that is, the period between drops is the same),
letting you see the change in the delay in ch. 1's signal.
- How do I use the cursors on the scope? Adjust them so that the start
of the rising edge and the end of the falling edge of the signal fall
within the cursor bars. The time delta will then be given on the scope in
the upper right-hand corner.
- The rising and falling edges aren't well-defined and I'm not sure if
the signal passes? Connect the probe's ground to a ground on the board.
This will greatly clarify the signal's edges.
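The triggering explanation above is easier to believe with numbers. The toy model below is only a sketch: it treats each sweep as drawing the ch. 1 edge at its time offset from the trigger instant, and it ignores edge direction, thresholds, and everything else a real scope does. The function name and the delay values are made up for illustration.

```python
def displayed_edge_ns(ch1_edge_ns, ch2_edge_ns, trigger_channel):
    """Screen position of the ch. 1 edge, measured from the trigger point."""
    if trigger_channel == 1:
        trigger_time = ch1_edge_ns   # the scope lines up on the signal itself
    elif trigger_channel == 2:
        trigger_time = ch2_edge_ns   # the scope lines up on the LEMO reference
    else:
        raise ValueError("trigger_channel must be 1 or 2")
    return ch1_edge_ns - trigger_time


# The reference edge on ch. 2 arrives at the same time in every sweep (t = 0),
# while the delay on ch. 1 steps through some illustrative values.
delays_ns = [0, 10, 20, 30]
on_ch1 = [displayed_edge_ns(d, 0, 1) for d in delays_ns]
on_ch2 = [displayed_edge_ns(d, 0, 2) for d in delays_ns]

if __name__ == "__main__":
    print("trigger on ch. 1:", on_ch1)  # the edge never moves
    print("trigger on ch. 2:", on_ch2)  # the edge moves with the delay
```

Triggering on ch. 1 pins the edge to the same spot on every sweep, so the delay change is invisible. Triggering on ch. 2 leaves the edge free to move by exactly the delay.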
Standby Test
Notes
Martin, what does this check?
Procedure
- Short the 3rd jumper from the bottom in the upper block of the board,
TP1_28 and TP1_29.
- In ALCT_Test, under Slow Control, under "Thresholds-Standby Test"
check "Standby" and go
through all the channels, checking that the LEDs in the top right of
the board light up in pairs. If there's a blinking red light, or they don't
light up in pairs, there's a problem.
- Remove the jumper when you're done and uncheck "Standby".
Troubleshooting
- Is the board connected to the test board? And they both have power?
Try resetting the power on both.
- There was one setup I saw that repeatedly failed every board on
channel 5, until the board being tested was simply power cycled, after
which everything worked fine.
- You did short those two pins I told you to short, right?
- And the clock's disabled, right?
Thresholds Test
Notes
I'm sure this tests something critical.
Procedure
- Set up the scope to trigger on ch. 2 and observe the signal on ch. 4,
which comes in from the test board. A useful time division is 10μs.
- ALCT_Test >> "Slow Control", under "Thresholds-Standby Test" hit "Go"
- Look for the signal on ch. 4 to bob slightly up and down constantly
and to periodically go from flat to a sudden drop followed by an
exponential rise and overshoot and back again to flat. You'll know it
when you see it; it's a bit difficult to describe. It's important that this
signal look clean, i.e., there's no sine-waviness or excessive noise in
it.
- Go through the channels at a pace you see fit.
- If you're familiar with the test and have a good idea of how it works,
you can have the computer increment the channels for you by checking "Chan
Loop".
Troubleshooting
- No clock or other jumpers set, right?
- Both boards have power?
- If the signal appears to be skittering about, check that you're
triggering on ch. 2.
- There is a series of spikes on ch. 4, but nothing looks like what it
should? You probably have the wrong firmware programmed.
- The signal changes too slowly, and eventually stops changing
altogether? This is probably a bug in ALCT_Test or the JTAG driver it uses.
Reset the computer.
Now set up the board to be burned. If the board has already been burned
(verifiable either on the burn-in checklist or the board's info sheet) go
on to the post-burn tests without disconnecting the test board.
Matt Matolcsi (madhat@ucla.edu); Last revision: 2003/07/17