Configuration fault recovery

Yannick Lamarre · May 16, 2017

Hi all,
I've been thinking about this problem for a while and have shared it with a few colleagues, but no one has come up with an answer yet.
With certain configurations, an FPGA can end up with two different drivers connected to the same internal line. A practical example would be two BUFGs driving the same line on a Spartan-6.
If those two drivers drive different values in a CMOS process, they connect the two rails together through a low-impedance path. Obviously, this will damage the chip.
Now the question is: How long can it stay in this state before it breaks?
An easier starter question: What is likely to break first and how?
The follow-up to all of this: can we design a current-limiter/cut-off circuit fast enough to prevent destruction of the chip?

Regards,
Yannick Lamarre

BobH · May 17, 2017

I don't think that the tool chain will let you do that. There are
several steps that should be able to catch it and error out. This is
assuming that you are using a "mature" tool chain.

Try manually instantiating two drivers to the same clock line and run it
through the tools. It may disconnect one for you or it may just refuse
to complete. If it automagically disconnects one for you, it may take
some real digging in the log files to find it, but I think it will just
error out.

BobH

Yannick Lamarre · May 17, 2017

Hi Bob,
You are skipping the mental exercise here. What if cosmic rays toggle the configuration bits so that this scenario occurs? That is quite possible in space, which is why there is a market for SEU controllers, monitors and the like. Now, back to the drawing board.

BobH · May 17, 2017

You are correct, I was assuming it was a design flaw.

To your original question, I suspect that a rail-to-rail short through a
couple of FETs would be very hard to detect in a generalized way from
the current signature. When a large circuit like a major clock
distribution network changes state, you will get a significant current
spike, probably not unlike what you would see at the beginning of the
short circuit. With the short circuit, that current will persist
until something craters (unless the drivers have some kind of foldback
current limiting). That seems like it might be detectable, until you
consider what would happen if a bunch of relatively static GPIO signals
driving external loads (maybe optocouplers at 20 mA each) transitioned
from off to on simultaneously. The destruct current for the clock driver
is probably less than the normal current signature in this case.

You MIGHT be able to make a current signature analysis work in highly
specific cases, but false tripping would be a serious problem.

The power supply decoupling capacitors are going to make detecting fast
current spikes difficult externally. You might be able to monitor the
voltage drop across the power supply bond wires in the package or
internal distribution system to estimate current flow without adding
sense resistance as a way to sense current after the decoupling caps.

In a previous job, I worked on hot swap power controllers. These chips
were supposed to deal with the inrush current of charging the bulk
capacitance on a board as it switches on, but shut down if the current
got too high or the inrush persisted too long. The only way we could
prevent false tripping was to set the thresholds and delays a lot higher
than you would expect. When they work, they work well. You can short out
a 100 A, 12 V rail with a pair of pliers, and it will switch off
before the power supply trips on overcurrent and shuts the whole cabinet down.
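
A minimal sketch of the threshold-plus-delay idea described above, assuming a sampled current trace. The function name, the sample values and the one-sample-per-microsecond rate are all hypothetical and are not taken from any real hot swap controller; the point is only to show why a longer persistence window avoids tripping on inrush while still catching a short that never decays.

```python
from typing import Iterable, Optional


def first_trip_index(samples_a: Iterable[float],
                     threshold_a: float,
                     persist_samples: int) -> Optional[int]:
    """Return the sample index at which the breaker would open, or None.

    The breaker opens once the current has stayed above threshold_a
    for persist_samples consecutive samples.
    """
    run = 0  # consecutive over-threshold samples seen so far
    for i, amps in enumerate(samples_a):
        run = run + 1 if amps > threshold_a else 0
        if run >= persist_samples:
            return i
    return None


# Hypothetical traces in amps, one sample per microsecond (assumed values).
inrush = [0.0, 8.0, 9.0, 6.0, 2.0, 1.0, 1.0, 1.0]      # brief charging surge
hard_short = [0.0, 8.0, 9.0, 9.0, 9.0, 9.0, 9.0, 9.0]  # current never decays

# No delay: the legitimate inrush trips the breaker immediately.
print(first_trip_index(inrush, threshold_a=5.0, persist_samples=1))      # 1
# With a 4-sample delay the inrush is ignored...
print(first_trip_index(inrush, threshold_a=5.0, persist_samples=4))      # None
# ...but the persistent short still trips, four samples after it starts.
print(first_trip_index(hard_short, threshold_a=5.0, persist_samples=4))  # 4
```

The cost of the delay is visible in the last line: the fault current flows for the whole persistence window before anything reacts, which is the trade-off between false tripping and reaction time.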

I think detecting configuration changes would be better done through
redundant LUTs or some similar method. You might even be able to
implement that in an existing FPGA via the tool chain. This would not be
the standard vendor-type tool chain, but a specialized one. Developing
this tool would be a good PhD project for someone.
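
One common reading of "redundant LUTs" is triple modular redundancy with a majority voter; that this is the scheme meant here is an assumption. The sketch below is a software model only: each LUT is represented by its truth-table word (called `init` here), and all function names are hypothetical. It covers upsets in LUT contents, not upsets in routing bits such as the driver conflict in the original question.

```python
from typing import List, Sequence, Tuple


def lut_read(init: int, inputs: int) -> int:
    """Output of a LUT whose truth table is `init`, addressed by `inputs`."""
    return (init >> inputs) & 1


def majority(a: int, b: int, c: int) -> int:
    """Bitwise two-out-of-three vote."""
    return (a & b) | (a & c) | (b & c)


def voted_read(copies: Sequence[int], inputs: int) -> Tuple[int, List[int]]:
    """Read three redundant LUT copies.

    Returns the voted output and the indices of copies that disagree with it.
    """
    outs = [lut_read(init, inputs) for init in copies]
    voted = majority(outs[0], outs[1], outs[2])
    disagree = [i for i, out in enumerate(outs) if out != voted]
    return voted, disagree


# 4-input AND gate: only truth-table bit 15 (inputs = 0b1111) is set.
good = 0x8000
upset = good ^ (1 << 15)  # model an SEU flipping truth-table bit 15

copies = [good, upset, good]
print(voted_read(copies, 0b1111))  # (1, [1]): copy 1 is outvoted and flagged
print(voted_read(copies, 0b0000))  # (0, []): the flipped bit is not addressed
```

The second call shows a limitation of voting as a detector: a flipped bit is only noticed when the inputs happen to address it.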

This is pretty much speculation on my part, and I am not going to claim
to be an expert on high-reliability design.

Good Luck,
Bob

Theo Markettos · May 18, 2017

Yannick Lamarre <yan.lamarre@gmail.com> wrote:

If those two drivers drive different values in a CMOS process, they connect the two rails together through a low-impedance path. Obviously, this will damage the chip.
Now the question is: How long can it stay in this state before it breaks?
An easier starter question: What is likely to break first and how?

Assuming you managed to defeat all the protections and turn on both
transistors, I don't think it will be that bad.
The transistors are sized such that they can achieve a suitable slew rate on
the capacitance they have to drive. It might be a long wire, but
on-chip the capacitance will be fairly small (guess: single pF or less).

Applying a simple T = RC with C = 1 pF and a time constant of 1 ns, the
driver resistance is 1 kΩ. Put two of those in series and you have 2 kΩ
across the power rail. If the rail is 1.2 V, that's 600 µA, or 720 µW.

I don't think anything is going to cook with that.
Maybe it would be bad if you managed to short a thousand of them, but
it would take some effort to procure the cosmic rays.
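
The estimate above can be reproduced in a few lines, which also makes it easy to try other guesses. This is only a sketch of the same back-of-envelope model: each driver is treated as a fixed resistance R = T/C, and the function names and the `n_pairs` parameter are hypothetical. The 1 pF, 1 ns and 1.2 V values are the guesses from the post, not datasheet figures.

```python
from typing import Tuple


def driver_resistance_ohm(tau_s: float, cap_f: float) -> float:
    """Effective on-resistance of one driver from T = RC."""
    return tau_s / cap_f


def contention(vdd_v: float, tau_s: float, cap_f: float,
               n_pairs: int = 1) -> Tuple[float, float]:
    """Current (A) and power (W) when n_pairs of opposing drivers fight.

    Each pair is modeled as two driver resistances in series across the rail.
    """
    r_pair = 2.0 * driver_resistance_ohm(tau_s, cap_f)
    current_a = n_pairs * vdd_v / r_pair
    power_w = vdd_v * current_a
    return current_a, power_w


i_one, p_one = contention(vdd_v=1.2, tau_s=1e-9, cap_f=1e-12)
print(f"one pair:   {i_one * 1e6:.0f} uA, {p_one * 1e6:.0f} uW")

i_many, p_many = contention(vdd_v=1.2, tau_s=1e-9, cap_f=1e-12, n_pairs=1000)
print(f"1000 pairs: {i_many:.2f} A, {p_many:.2f} W")
```

With these guesses a single conflict gives 600 µA and 720 µW, and a thousand simultaneous conflicts scale linearly to 0.6 A and 0.72 W.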

Theo
