TECHNICAL DETAILS

Real time testing allows measuring the non-accelerated soft error rate from the following sources:

  • atmospheric radiation:
    1. neutrons (with acceleration factor due to location and elevation)
    2. protons (with acceleration factor due to location and elevation)
    3. natural thermal neutrons (these depend on the test environment, not the altitude)
  • intrinsic radiation:
    1. alpha particles from packaging

Unlike an accelerated test, in which one contributor is accelerated to a level at which the contribution of the others can be neglected, a non-accelerated test yields a mix of all the contributors.
The thermal neutron effect, if relevant, can be filtered out with proper shielding (such as a calibrated cadmium foil or a silicone rubber sheet containing boron nitride). Because the altitude acceleration factor applies only to natural cosmic rays and not to alpha particles, a second real time test should be performed to separate the effect of alpha particles from that of cosmic rays and thermal neutrons: either a cave test or a sea-level test (where the particle flux distribution differs from that at high altitude).
A cave test isolates a single contributor: alpha particles. If it is performed on the same test setup as the mountain test, the two results can be compared and subtracted to single out the contribution of each source of error.
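
The subtraction can be sketched as follows. This is only an illustration under stated assumptions, not a description of the actual analysis procedure: it assumes the cave rate is purely alpha-induced, that the mountain rate equals the alpha rate plus the sea-level cosmic rate multiplied by the altitude acceleration factor, and that thermal neutrons have been shielded out. The function name, parameter names, and numbers are hypothetical.

```python
def separate_contributions(mountain_fit, cave_fit, acceleration_factor):
    """Split a mountain-test SER into alpha and cosmic parts.

    Assumed model (an illustration, not taken from the source):
        mountain_fit = alpha_fit + acceleration_factor * cosmic_fit_sea_level
        cave_fit     = alpha_fit
    """
    if acceleration_factor <= 0:
        raise ValueError("acceleration factor must be positive")
    alpha_fit = cave_fit
    cosmic_at_altitude = mountain_fit - cave_fit
    if cosmic_at_altitude < 0:
        raise ValueError("cave rate exceeds mountain rate; check the inputs")
    # Scale the cosmic part back to sea level; alpha is not accelerated.
    cosmic_sea_level = cosmic_at_altitude / acceleration_factor
    return alpha_fit, cosmic_sea_level


# Made-up numbers, for illustration only.
alpha, cosmic = separate_contributions(
    mountain_fit=1300.0, cave_fit=100.0, acceleration_factor=6.0
)
print(alpha, cosmic)
```

In practice each rate carries its own statistical uncertainty, so the difference of two measured rates is less precise than either measurement alone.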

Test setup
The design of the set-up should ensure that all detected errors come from the devices under test (DUT), not from external or system noise. A specific debug PCB holding 3 DUTs is usually manufactured to confirm that the tester accesses the device properly. The test setup, including the tester, is shielded or uses high-reliability, low-noise components.
We provide remote real time monitoring of the experiment through Ethernet access, as well as a remote reset function. A status report is provided to the customer on a monthly basis.
High temperature can adversely affect the soft error rate and latch-up behaviour. Because the setup itself dissipates heat, its temperature is regulated. Note that other experiments are usually present in the same lab room and can also be affected by heat.
For Real Time SER tests, iRoC uses a PC to monitor racks of DUT boards through a dedicated SER tester. The figure below gives an idea of the test configuration. All of this equipment has a very low error rate.
The UPS, power supply, and PC are dedicated to the experiment for its entire duration. The test is therefore not interrupted, other than by momentary issues that are fixed within a few hours. The PCBs are manufactured for the purpose of the test and are the property of the customer.

The DUTs are controlled through a dedicated tester that starts and reads all the memories in the rack at a set time interval. If the tester detects an error, it sends the control PC all the information concerning this error: timestamp, chip number, failing address, expected data word, and data word read.
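
As a rough sketch of what such an error report might look like on the PC side, the record below carries the five fields listed above. The class name, field names, and example values are hypothetical; the actual tester protocol is not described here.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ErrorRecord:
    """One error report sent by the tester (hypothetical layout)."""
    timestamp: datetime
    chip_number: int
    failing_address: int
    expected_word: int
    read_word: int

    def flipped_bits(self):
        """Return the bit positions where the read word differs from the expected word."""
        diff = self.expected_word ^ self.read_word
        return [bit for bit in range(diff.bit_length()) if (diff >> bit) & 1]


# Illustrative values only.
record = ErrorRecord(
    timestamp=datetime(2024, 1, 1, tzinfo=timezone.utc),
    chip_number=12,
    failing_address=0x3FA0,
    expected_word=0x5555,
    read_word=0x5755,
)
print(record.flipped_bits())  # [9]: a single-bit upset at bit 9
```

Keeping both the expected and the read word makes it possible to distinguish single-bit from multiple-bit upsets after the fact.
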
The design of the set-up ensures that all detected errors come from the devices under test, not from external or system noise, by following these guidelines:

  • Once failing data is detected, the data is read again a given number of times (at least 10) before it is rewritten. Consistency of the failed data over these read cycles confirms that the failure is a soft error in the DUT rather than a transient read error.
  • The tester implements high-reliability system techniques: redundancy in the logic interface with the DUT and a watchdog for periodic re-initialization of the tester.
  • The power supplies are designed for uninterruptible operation and very low noise. The power supply voltages are permanently monitored. The voltage and current drawn by the DUT during the standby mode are logged periodically in the PC.
  • The PCBs are designed for very low internal noise. They are multi-layered, with alternating signal and ground planes for high immunity to EMI, and use controlled impedance to maintain signal integrity with a relatively high number of circuits on a bus. The DUT chips are connected in daisy-chain with a limited number of chips on each bus. An FPGA controls the read and write operations for each PCB.
  • The tester and array of DUT boards are properly shielded against EMI.
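
The re-read rule in the first guideline can be sketched as below. This is a minimal illustration, assuming a hypothetical `read_word(address)` callable standing in for the tester's read access; it is not the tester's actual firmware logic.

```python
def confirm_soft_error(read_word, address, expected, failing_word, rereads=10):
    """Re-read a failing address before it is rewritten.

    read_word is a hypothetical callable: address -> data word.
    Returns True only if every re-read returns the same failing word,
    which points to a flipped cell rather than bus or system noise.
    Stops at the first re-read that disagrees.
    """
    if failing_word == expected:
        return False  # nothing failed in the first place
    return all(read_word(address) == failing_word for _ in range(rereads))


# A cell that really holds a flipped bit: every re-read returns 0xFE.
memory = {0x10: 0xFE}
print(confirm_soft_error(memory.get, 0x10, expected=0xFF, failing_word=0xFE))  # True

# A one-off glitch: re-reads return the correct word again.
print(confirm_soft_error(lambda address: 0xFF, 0x10, expected=0xFF, failing_word=0xFE))  # False
```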

Test strategy
For Real Time SER testing, the goal is to prove, at a 95% confidence level, that the average SER is below a specified value. The number of errors observed drives the strategy of the statistical test. Every time an error occurs, we can correlate it with the integrated flux received to obtain a FIT number. We end the test when the number of errors is statistically significant and consistent. For the result to be meaningful, we target a minimum of 10 errors. If the 10 soft errors occur before the calculated duration, we can conclude that the target value of FIT is reached with more than 95% confidence.
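
One common way to attach a confidence level to an error count is a one-sided Poisson upper limit, sketched below. The statistical method used in practice is not specified here, so this is an assumption: errors are treated as a Poisson process, FIT is taken as failures per 10^9 device-hours, and no acceleration factor is applied. All function names and numbers are hypothetical.

```python
import math


def poisson_cdf(n, mean):
    """P(X <= n) for a Poisson variable with the given mean."""
    term = math.exp(-mean)
    total = term
    for k in range(1, n + 1):
        term *= mean / k
        total += term
    return total


def upper_limit_events(n, confidence=0.95):
    """One-sided upper confidence limit on the Poisson mean, given n observed events."""
    low, high = 0.0, max(10.0, 10.0 * (n + 1))
    for _ in range(200):  # bisection on the mean
        mid = (low + high) / 2
        if poisson_cdf(n, mid) > 1 - confidence:
            low = mid   # mean still too small to exclude
        else:
            high = mid
    return (low + high) / 2


def fit_upper_bound(errors, devices, hours, confidence=0.95):
    """Upper bound on the FIT rate (failures per 1e9 device-hours)."""
    device_hours = devices * hours
    return upper_limit_events(errors, confidence) / device_hours * 1e9


# Illustrative numbers: 10 errors on 1000 devices over 5000 hours.
print(fit_upper_bound(errors=10, devices=1000, hours=5000))
```

With zero observed errors the limit on the mean is -ln(0.05), about 3 events, which is why a test with very few errors needs a long exposure to demonstrate a low FIT target.
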
Alpha particles and cosmic rays are two independent sources of SER. Therefore we can state that:
        Number of errors = Number of errors (alpha) + Number of errors (cosmic)

During a Real Time SER test it is useful to separate the contributions of cosmic rays and alpha particles, as is done in accelerated testing with high-flux neutron sources and alpha sources. This point is very important in designing the right test strategy.