Approach to Finding the Root Cause of Failures

On Tue, 31 Mar 2020 08:34:34 -0700 (PDT), blocher@columbus.rr.com
wrote:

Another topic that I hope can elicit engineering discussion:

What makes up a good skill set for finding the root cause of a failure that is rare, intermittent or obscure?

Over the past several years I have been more involved in root cause failure analysis than I was when I was doing more design work. In many ways I think it is more challenging than design work. It takes a different mindset than design.

Here is my reminder list when doing root cause studies:

1. Never root for a particular outcome when performing a test. Root for not being fooled by the results of your test.

2. Assign weighting factors to everything you believe. Never assign a weighting factor of 1 to anything until you know you have the problem solved.

3. Expect to have to repeat certain tests, and expect that on repeating a test you will sometimes draw the opposite conclusion from the one you drew the first time.

4. Taking guidance from "helpful" outsiders is challenging. On the one hand they care and are smart; on the other hand, if you go about chasing other people's ideas (often conceived just to demonstrate in a meeting that they are concerned) you will never get a clear path to troubleshoot the problem in your own way.
Help is a two-edged sword. It is important but can sometimes be problematic.

5. As an aside - I have learned that when I "see something" during the design phase, I no longer look at that as a curse, but as a blessing. If you ignore it, it is going to come back and get you later.

6. Get past the notion that having nothing to show for a day's work is bad. As a designer you can show a day's work for a day's pay. In root cause work you can feel like you have accomplished nothing for a long time. Frequently, though, these problems are the most visible problems in an organization and can make the difference between losing a customer and keeping one.

7. Look for contradictions in your thinking. Use other people to help you find contradictions in your thinking.

OK - enough for now......
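
Item 2 in the list above can be read as a rough Bayesian bookkeeping exercise. Here is a minimal sketch in Python, assuming each belief is a weight between 0 and 1 and each test result is scored by how likely it would be under each suspect. The suspects, starting weights and likelihoods are invented for illustration; the post itself does not prescribe any particular arithmetic.

```python
def update_weights(weights, likelihoods):
    """Scale each weight by how well that hypothesis predicts the observed
    test result (Bayes' rule), then renormalize so the weights sum to 1."""
    unnormalized = {h: weights[h] * likelihoods[h] for h in weights}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("observation is impossible under every hypothesis")
    return {h: w / total for h, w in unnormalized.items()}


# Hypothetical suspects and starting weights (none of them is 0 or 1).
weights = {"connector": 0.5, "firmware": 0.3, "power supply": 0.2}

# Hypothetical test: the connector was reseated and the unit still failed.
# Each number is the assumed chance of seeing that result if the suspect
# really is the cause.
likelihoods = {"connector": 0.1, "firmware": 0.9, "power supply": 0.9}

weights = update_weights(weights, likelihoods)
for suspect, weight in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{suspect:>12}: {weight:.2f}")
```

The connector drops down the list but does not go to zero, so a later repeat of the test can still bring it back, which fits item 3 as well.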

One thing that helps to find intermittents is temperature testing. If
you temp test new designs, you'll have a lot fewer bugs later.

--

John Larkin Highland Technology, Inc
picosecond timing precision measurement

jlarkin att highlandtechnology dott com
http://www.highlandtechnology.com
 
Whoops!! 2nd try!!

With all the above being typed and read I have a much simpler way
to look at the problem.

Just use the "Not Method of Troubleshooting".

The Not Method goes like this.

It's Not this!!
It's Not that!!
Once you have identified all the Nots, the only thing left
is not a Not, but is the real problem.
Fix it or replace it and move on!!

Now I am sure someone will find fault with my method, well OK then!!
Some days the Nots just have to be adjusted.

Have a good day!!

Les
 
On Tuesday, March 31, 2020 at 3:11:03 PM UTC-4, John Larkin wrote:
One thing that helps to find intermittents is temperature testing. If
you temp test new designs, you'll have a lot fewer bugs later.

I do not think I have ever designed anything (not counting test items and a few R&D jobs) that has not required operation from -55C to +70C. I never really thought of that until now.
 
On Tue, 31 Mar 2020 12:18:02 -0700 (PDT), blocher@columbus.rr.com
wrote:

I do not think I have ever designed anything (not counting test items and a few R&D jobs) that has not required operation from -55C to +70C. I never really thought of that until now.

-55 is severe. Did you test beyond the required range? A lot of things
are first-order compensated for temperature.

A lot of timing/race conditions are temperature dependent. One guy we
worked with used the wrong clock edge to strobe ADC data into an FPGA
at 250 MHz. THAT was temperature dependent!
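
To see why a capture-edge mistake can show up only at temperature, here is a rough Python sketch of a setup-margin check. Only the 4 ns clock period (250 MHz) comes from the post. The clock-to-output delay, its drift per degree, the setup time and the function names are all hypothetical, and hold time, clock skew and board delay are ignored.

```python
CLOCK_PERIOD_NS = 4.0  # 250 MHz, from the post; every other number is invented

# Hypothetical ADC clock-to-output delay: 1.5 ns at 25 C, drifting 4 ps per degree C.
T_CO_25C_NS = 1.5
T_CO_DRIFT_NS_PER_C = 0.004
# Hypothetical FPGA input setup time.
T_SETUP_NS = 0.3


def clock_to_out_ns(temp_c):
    """Delay from the launching clock edge until the ADC data is valid."""
    return T_CO_25C_NS + T_CO_DRIFT_NS_PER_C * (temp_c - 25.0)


def setup_slack_ns(temp_c, capture_time_ns):
    """Positive slack: data is valid at least T_SETUP_NS before the capture edge."""
    return capture_time_ns - T_SETUP_NS - clock_to_out_ns(temp_c)


for temp_c in (-55, 25, 70, 85):
    # Capture half a period after the launch edge versus a full period after it.
    half = setup_slack_ns(temp_c, CLOCK_PERIOD_NS / 2)
    full = setup_slack_ns(temp_c, CLOCK_PERIOD_NS)
    print(f"{temp_c:>4} C  half-period slack {half:+.2f} ns  full-period slack {full:+.2f} ns")
```

With these made-up numbers the half-period capture has a little margin on the bench and goes negative when hot, while the full-period capture keeps plenty of margin everywhere. Real parts will differ, but the shape of the problem is the same.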

--

John Larkin Highland Technology, Inc
picosecond timing precision measurement

jlarkin att highlandtechnology dott com
http://www.highlandtechnology.com
 
On Tuesday, March 31, 2020 at 3:04:56 PM UTC-4, John Larkin wrote:
That's the Sherlock Holmes technique. It doesn't work very well. The
list of NOTs to test is too big, and you are unlikely to include in
the list the things you missed when you did the design.

The NOT technique is a last resort. (It's how I found a
leaky toggle switch... we had a bag of leaky
switches; most circuits don't care if there
are a few megohms of resistance.) Before you pull all
your hair out, you pull all the components out and replace 'em.
But how do you know the replacement component is good?
It quickly becomes a knotty nightmare.

I find it best to get as much data as possible,
and then sleep on it*. When you think about it 'actively'
you tend to get stuck in your first assumption rut.
(And if your first assumption had been right, it'd be
fixed/found already. :^)

George H.
*or go explain the problem to someone else... not that they will
be able to help (well they might) but because having to explain it
makes you go over the whole circuit and may remind you of the part
you haven't been thinking about.
 
On Tuesday, March 31, 2020 at 3:43:05 PM UTC-4, John Larkin wrote:
-55 is severe. Did you test beyond the required range? A lot of things
are first-order compensated for temperature.

A lot of timing/race conditions are temperature dependent. One guy we
worked with used the wrong clock edge to strobe ADC data into an FPGA
at 250 MHz. THAT was temperature dependent!

We typically do not test below -55. Things are usually pretty good to -40, then the last 15 degrees can be squirrelly. Most of the parts are specified to -40 (no surprise). They still work at -55; it is just that unknown parameters can fall out of spec. On the hot side we frequently run the units to 90 degrees or hotter to find failures and see what breaks first. 85 is not usually too bad.

At the moment we have a design where the SPI bus is not reading correctly at +70C. It was a copy of another design that we completely validated at +70C. That is how corner cases work.
 
On 2020-03-31 14:40, Rick C wrote:
On Tuesday, March 31, 2020 at 12:41:36 PM UTC-4, David Brown wrote:
On 31/03/2020 17:40, blocher@columbus.rr.com wrote:
Also - the FPGA guys and the SW guys will only acknowledge a problem when it is laid out under their nose. It is never their fault :)


That's because it's usually a hardware fault - and it can be solved by
using a bigger capacitor :)

You laugh, but I once used a telephony part that had a PSRR of 0 dB, which I had missed. (Who expects 0 dB?) On the customer's work bench they were getting noise in the audio that turned out to be from the DSP power consumption. They were using clip leads to provide power to the UUT and the on-board capacitance wasn't enough to mitigate it. We told them to use better power connections and also used a larger cap.

0 dB of PSRR??? How can you even do that exactly??? CP Clare, what a piece of work they are. The other CP Clare part had a problem that virtually made it unusable, but they didn't point it out in the data sheet. I wonder if they actually use engineers or if they just let high school kids design their ICs?

Are you quoting that WRT the input or the output? PSRR and CMRR are
normally quoted input-referred, i.e. to find out the effect you have to
multiply by the overall gain.

There are lots of parts that can have negative-dB PSRR as referred to
the output.
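
As a rough illustration of the input-referred point, here is a minimal Python sketch. It assumes PSRR is given as a positive rejection figure in dB and that the overall gain is a plain voltage ratio. The function names and the 10 mV / gain-of-10 numbers are made up for the example and are not taken from any data sheet.

```python
import math


def output_ripple_v(supply_ripple_v, psrr_db_input_referred, gain):
    """Supply ripple as seen at the output, for an input-referred PSRR figure.

    The ripple is first divided by the rejection ratio to get the equivalent
    input error, then multiplied by the overall gain.
    """
    equivalent_input_v = supply_ripple_v / (10 ** (psrr_db_input_referred / 20))
    return equivalent_input_v * gain


def psrr_db_output_referred(psrr_db_input_referred, gain):
    """The same rejection expressed relative to the output instead of the input."""
    return psrr_db_input_referred - 20 * math.log10(gain)


# Hypothetical numbers: 10 mV of supply ripple, 0 dB input-referred PSRR, gain of 10.
ripple = output_ripple_v(0.010, 0.0, 10.0)
print(f"output ripple: {ripple * 1000:.0f} mV")
print(f"output-referred PSRR: {psrr_db_output_referred(0.0, 10.0):.0f} dB")
```

With a gain above 1, a 0 dB input-referred figure becomes a negative number of dB at the output, which is one way a part can end up with negative-dB PSRR as referred to the output.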

Cheers

Phil Hobbs

--
Dr Philip C D Hobbs
Principal Consultant
ElectroOptical Innovations LLC / Hobbs ElectroOptics
Optics, Electro-optics, Photonics, Analog Electronics
Briarcliff Manor NY 10510

http://electrooptical.net
http://hobbs-eo.com
 
On 31/03/20 19:55, ABLE1 wrote:
The Not Method goes like this.

    It's Not this!!
    It's Not that!!
    Once you have identified all the Nots, the only thing left
    is not a Not, but is the real problem.
    Fix it or replace it and move on!!

Now I am sure someone will find fault with my method, well OK then!!
Some days the Nots just have to be adjusted.

Ah, the Sherlock Holmes technique.

Fails dismally because people's imagination is finite and the
number of "Nots" is infinite.
 
On Tue, 31 Mar 2020 13:21:39 -0700 (PDT), blocher@columbus.rr.com
wrote:

At the moment we have a design where the SPI bus is not reading correctly at +70C. It was a copy of another design that we completely validated at +70C. That is how corner cases work.

We had an Analog Devices SPI ADC that was flakey with temperature.
They were no help. We spun the board and used a TI part.

--

John Larkin Highland Technology, Inc
picosecond timing precision measurement

jlarkin att highlandtechnology dott com
http://www.highlandtechnology.com
 
On Tuesday, March 31, 2020 at 4:08:59 PM UTC-4, Phil Hobbs wrote:
On 2020-03-31 14:40, Rick C wrote:
On Tuesday, March 31, 2020 at 12:41:36 PM UTC-4, David Brown wrote:
On 31/03/2020 17:40, blocher@columbus.rr.com wrote:
On Tuesday, March 31, 2020 at 11:34:44 AM UTC-4, blo...@columbus.rr.com wrote:
Another topic that I hope can elicit engineering discussion:

What makes up a good skill set for finding the root cause of a failure that is rare, intermittent or obscure?

Over the past several years I have been more involved in root cause failure than I was when I was doing more design work. In many ways I think it is more challenging than design work. It takes a mindset that is different than design.

Here is my reminder list when doing root cause studies

1. Never root for a particular outcome when performing a test. Root for not being fooled by the results of your test.

2. Assign weighting factors to everything you believe. Never assign a weighting factor of 1 to anything until you know you have the problem solved.

3. Expect to have to do certain tests over again, and expect that a repeated test will sometimes lead you to the opposite conclusion from the one you drew the first time.

4. Taking guidance from "helpful" outsiders is challenging. On the one hand they care and are smart; on the other hand, if you go about chasing other people's ideas (often conceived just to demonstrate concern in a meeting) you will never get a clear path to troubleshoot the problem in your own way.
Help is a two-edged sword. It is important but can sometimes be problematic.

5. As an aside - I have learned that when I "see something" during the design phase, I no longer look at that as a curse, but as a blessing. It is going to come back and get you later.

6. Get past the notion that having nothing to show for a day's work is bad. As a designer you can show a day's work for a day's pay. In root cause you feel like you have accomplished nothing for a long time. Frequently, though, these problems are the most visible problems in an organization and can make the difference between losing a customer and keeping one.

7. Look for contradictions in your thinking. Use other people to help you find contradictions in your thinking.

OK - enough for now......
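
Item 2 in the list above can be sketched as simple Bayesian-style bookkeeping. This is only an illustration, not the original poster's procedure: the suspect names, the likelihood numbers, and the `update_weights` helper are all made up for the example, and the 0.99 cap is just one assumed way to keep any belief from reaching 1 before the problem is solved.

```python
def update_weights(weights, likelihoods, cap=0.99):
    """Re-weight hypotheses after one test result.

    weights:     dict of hypothesis -> current belief (sums to 1)
    likelihoods: dict of hypothesis -> P(observed result | hypothesis)
    cap:         upper limit so no belief reaches 1 (assumed value)
    """
    # Bayes' rule: posterior is proportional to prior * likelihood.
    unnormalized = {h: weights[h] * likelihoods[h] for h in weights}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("result is impossible under every hypothesis; "
                         "the hypothesis list is incomplete")
    posterior = {h: w / total for h, w in unnormalized.items()}

    # "Never assign a weighting factor of 1": clip the leading hypothesis
    # and spread the excess evenly over the others.
    leader = max(posterior, key=posterior.get)
    if posterior[leader] > cap and len(posterior) > 1:
        excess = posterior[leader] - cap
        posterior[leader] = cap
        share = excess / (len(posterior) - 1)
        for h in posterior:
            if h != leader:
                posterior[h] += share
    return posterior


# Hypothetical suspects for an intermittent fault, equally likely at first.
beliefs = {"firmware": 1 / 3, "power supply": 1 / 3, "connector": 1 / 3}

# Hypothetical test: the fault persisted after swapping the connector,
# an outcome that would be unlikely if the connector were the cause.
beliefs = update_weights(
    beliefs, {"firmware": 0.9, "power supply": 0.9, "connector": 0.1}
)
for name, weight in sorted(beliefs.items(), key=lambda kv: -kv[1]):
    print(f"{name:13s} {weight:.2f}")
```

The connector drops in weight but is not written off, which matters when a repeated test later points the other way.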

Also - the FPGA guys and the SW guys will only acknowledge a problem when it is laid out under their nose. It is never their fault :)


That's because it's usually a hardware fault - and it can be solved by
using a bigger capacitor :)

You laugh, but I once used a telephony part that had a PSRR of 0 dB, which I had missed. (Who expects 0 dB?) On the customer's work bench they were getting noise in the audio that turned out to be from the DSP power consumption. They were using clip leads to provide power to the UUT and the on-board capacitance wasn't enough to mitigate it. We told them to use better power connections and also used a larger cap.

0 dB of PSRR??? How can you even do that exactly??? CP Clare, what a piece of work they are. The other CP Clare part had a problem that virtually made it unusable, but they didn't point it out in the data sheet. I wonder if they actually use engineers or if they just let high school kids design their ICs?

Are you quoting that WRT the input or the output? PSRR and CMRR are
normally quoted input-referred, i.e. to find out the effect you have to
multiply by the overall gain.

There are lots of parts that can have negative-dB PSRR as referred to
the output.

This was a telephone line isolation interface. One end was connected to the phone line, an isolation capacitor (high frequency chopper) crossed the isolation barrier and the other side of the chip connected to the low voltage CODEC circuit.

Not sure it matters if the spec was input or output referred since the circuit has no gain, just isolation.

We had some low-level audio frequency noise on the power rail (10 mV comes to mind) which showed up in the data as an audible tone which corresponded to the processing loop of the DSP. 10 mV seems like an acceptable amount of noise in a power supply line, but I suppose normally PS noise is outside the audible range. The noise wasn't loud, but it was present. The fact that it came and went was what made it noticeable.

Compare to op amps where I typically see a large amount of PSRR in the audio range, some 50 dB and up. The impact of 10 mV audio noise would not be measurable in most op amp circuits.
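
The comparison above is easy to put numbers on. The sketch below is only arithmetic on figures mentioned in this thread (10 mV of rail noise, 0 dB versus 50 dB of PSRR). The function name is made up, and it assumes unity gain, so that input-referred and output-referred PSRR give the same answer.

```python
def ripple_after_psrr(rail_noise_v, psrr_db, gain=1.0):
    """Supply noise that reaches the output, in volts.

    PSRR is treated as input-referred: the coupled noise is attenuated
    by the PSRR and then multiplied by the circuit gain.
    """
    attenuation = 10 ** (-psrr_db / 20.0)  # dB -> voltage ratio
    return rail_noise_v * attenuation * gain


rail_noise = 10e-3  # 10 mV of audio-band noise on the supply rail

for label, psrr_db in [("isolation part, 0 dB", 0.0), ("op amp, 50 dB", 50.0)]:
    out = ripple_after_psrr(rail_noise, psrr_db)
    print(f"{label:22s} -> {out * 1e6:8.1f} uV at the output")
```

With 0 dB of rejection the full 10 mV lands in the signal path, while 50 dB leaves roughly 32 uV.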

--

Rick C.

-+ Get 1,000 miles of free Supercharging
-+ Tesla referral code - https://ts.la/richard11209
 
On Tuesday, March 31, 2020 at 4:08:59 PM UTC-4, Phil Hobbs wrote:
On 2020-03-31 14:40, Rick C wrote:
On Tuesday, March 31, 2020 at 12:41:36 PM UTC-4, David Brown wrote:
On 31/03/2020 17:40, blocher@columbus.rr.com wrote:
On Tuesday, March 31, 2020 at 11:34:44 AM UTC-4, blo...@columbus.rr.com wrote:
<snip>

Are you quoting that WRT the input or the output? PSRR and CMRR are
normally quoted input-referred, i.e. to find out the effect you have to
multiply by the overall gain.

There are lots of parts that can have negative-dB PSRR as referred to
the output.
At higher frequencies, aren't there many op amps that cross
0 dB PSRR, at least for one of the rails?
(That's why God* invented the cap. multiplier.)

George H.
*or one of his offspring.... who did do the cap mult. first?

Cheers

Phil Hobbs

--
Dr Philip C D Hobbs
Principal Consultant
ElectroOptical Innovations LLC / Hobbs ElectroOptics
Optics, Electro-optics, Photonics, Analog Electronics
Briarcliff Manor NY 10510

http://electrooptical.net
http://hobbs-eo.com
 
On Tue, 31 Mar 2020 13:35:57 -0700 (PDT), George Herold
<ggherold@gmail.com> wrote:

On Tuesday, March 31, 2020 at 3:04:56 PM UTC-4, John Larkin wrote:
On Tue, 31 Mar 2020 14:55:02 -0400, ABLE1 <somewhere@nowhere.net>
wrote:

On 3/31/2020 11:34 AM, blocher@columbus.rr.com wrote:

<snip>

Whoops!! 2nd try!!

With all the above being typed and read, I have a much simpler way
to look at the problem.

Just use the "Not Method of Troubleshooting".

The Not Method goes like this.

It's Not this!!
It's Not that!!
Once you have identified all the Not's, the only thing left
is not a not, but is the real problem.
Fix it or replace and move on!!

Now I am sure someone will find fault with my method, well Ok then!!
Some days the Not's just have to be adjusted.

Have a good day!!

Les

That's the Sherlock Holmes technique. It doesn't work very well. The
list of NOTs to test is too big, and you are unlikely to include in
the list the things you missed when you did the design.
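
A minimal sketch of that objection, with made-up suspect names and a stand-in `is_cleared` check (both are assumptions for illustration): elimination can only return something that was on the list, so if the real cause was never listed, every suspect gets cleared and nothing is left.

```python
def not_method(suspects, is_cleared):
    """Eliminate suspects one by one and return whatever is left.

    suspects:   candidate causes (only what someone thought to list)
    is_cleared: callable returning True when a test rules the suspect out
    """
    return [s for s in suspects if not is_cleared(s)]


# Hypothetical bench scenario: the true cause is a layout coupling problem
# that nobody thought to put on the list.
true_cause = "layout coupling"
suspects = ["regulator", "ADC", "connector", "firmware timing"]

remaining = not_method(suspects, is_cleared=lambda s: s != true_cause)
print(remaining or "every listed suspect cleared; the cause was never on the list")

# When the true cause does happen to be listed, elimination finds it,
# at the cost of one test per suspect.
remaining = not_method(suspects + [true_cause], is_cleared=lambda s: s != true_cause)
print(remaining)
```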

The NOT technique is a last resort. (It's how I found a
leaky toggle switch... we had a bag of leaky
switches; most circuits don't care if there
are a few megohms of resistance.) Before you pull all
your hair out, you pull all the components out and replace 'em.
But how do you know the replacement component is good?
Quickly it becomes a knotty nightmare.

I find it best to get as much data as possible,
and then sleep on it*. When you think about it 'actively'
you tend to get stuck in your first assumption rut.
(And if your first assumption had been right, it'd be
fixed/found already. :^)

George H.
*or go explain the problem to someone else... not that they will
be able to help (well they might) but because having to explain it
makes you go over the whole circuit and may remind you of the part
you haven't been thinking about.

--

John Larkin Highland Technology, Inc
picosecond timing precision measurement

jlarkin att highlandtechnology dott com
http://www.highlandtechnology.com

I say to myself, "This was designed by an idiot. What stupid mistake
did he make?"

There is a tendency to blame parts, when the problem is usually
design.




--

John Larkin Highland Technology, Inc

Science teaches us to doubt.

Claude Bernard
 
blo...@columbus.rr.com wrote:

-------------------------------
Another topic that I hope can elicit engineering discussion:

** IOW another mindless troll.

What makes up a good skill set for finding the root cause of a
failure that is rare, intermittent or obscure?

** Analyse the actual failure first.

Something good service techs do every day, but few designers have a clue about.

Your dopey rules are all context free generalizations, so totally meaningless.




....... Phil
 
On 1/4/20 2:34 am, blocher@columbus.rr.com wrote:
<snip>

Identify the earliest necessary or possible precursor to the failure
symptom, and search for that instead.
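
One way to read that advice, sketched under assumptions: if event logs exist for several failing runs and at least one good run, look for the first event that precedes the symptom in every failing run and never appears in a good one. The event names, the log format, and the `earliest_common_precursor` helper are all hypothetical, and excluding anything seen in a good run is a heuristic rather than a proof of causation.

```python
def earliest_common_precursor(failing_runs, passing_runs, symptom):
    """Return the earliest event that precedes the symptom in every failing
    run and never appears in a passing run, or None if there is none.

    failing_runs: list of event-name lists, each in time order
    passing_runs: list of event-name lists from runs without the symptom
    symptom:      event name that marks the visible failure
    """
    prefixes = []
    for events in failing_runs:
        if symptom not in events:
            raise ValueError("every failing run must contain the symptom")
        prefixes.append(events[: events.index(symptom)])

    # "Necessary": present before the symptom in every failing run.
    common = set(prefixes[0]).intersection(*prefixes[1:])
    # Heuristic filter: drop events that also occur when nothing fails.
    candidates = common - set().union(*passing_runs)

    # Report the candidate that occurs first in the first run's timeline.
    for event in prefixes[0]:
        if event in candidates:
            return event
    return None


# Hypothetical logs.
failing = [
    ["boot", "temp>60C", "rail dip", "SPI CRC error", "watchdog reset"],
    ["boot", "rail dip", "fan stall", "SPI CRC error", "watchdog reset"],
    ["boot", "USB attach", "rail dip", "SPI CRC error", "watchdog reset"],
]
passing = [["boot", "temp>60C", "USB attach", "shutdown"]]

print(earliest_common_precursor(failing, passing, "watchdog reset"))
```

In this made-up data the search target becomes the rail dip rather than the watchdog reset that the customer actually sees.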

CH
 
On Wednesday, April 1, 2020 at 10:34:08 AM UTC+11, jla...@highlandsniptechnology.com wrote:
On Tue, 31 Mar 2020 13:35:57 -0700 (PDT), George Herold
<ggherold@gmail.com> wrote:

On Tuesday, March 31, 2020 at 3:04:56 PM UTC-4, John Larkin wrote:
On Tue, 31 Mar 2020 14:55:02 -0400, ABLE1 <somewhere@nowhere.net>
wrote:

On 3/31/2020 11:34 AM, blocher@columbus.rr.com wrote:

<snip>

*or go explain the problem to someone else... not that they will
be able to help (well they might) but because having to explain it
makes you go over the whole circuit and may remind you of the part
you haven't been thinking about.

I say to myself, "This was designed by an idiot. What stupid mistake
did he make?"

This isn't actually useful. There are plenty of idiots around, so it can be productive, but there are a lot of circuits where you have to look quite hard to see what the designer was worried about and work out what some ostensibly strange feature was intended to deal with.

Writing something off to idiocy prematurely could leave you with egg all over your face.

There is a tendency to blame parts, when the problem is usually
design.

Or the lack of it.

In one circuit I was working on there was an op amp that clearly should have been oscillating, but its output was loaded with a 100nF ceramic capacitor.

When I looked hard, it was actually oscillating, but at a very low amplitude.

When I put in a more classical solution the oscillation went away and the DC offset at the amplifier went down.

The machine involved made about 95% of the single crystal GaAs manufactured in the free world at that time, and the GaAs might have had a little less thermal stress if it had been made in machines retrofitted with my version of that circuit.

--
Bill Sloman, Sydney
 
On Tuesday, March 31, 2020 at 7:02:43 PM UTC-4, Phil Allison wrote:
blo...@columbus.rr.com wrote:

-------------------------------
Another topic that I hope can elicit engineering discussion:


** IOW another mindless troll.

Whew, pot calling the kettle black!

--

Rick C.

-+ Get 1,000 miles of free Supercharging
-+ Tesla referral code - https://ts.la/richard11209
 
On Wednesday, April 1, 2020 at 4:59:07 AM UTC+11, Tom Gardner wrote:
On 31/03/20 18:17, George Herold wrote:
Hmm OK. I designate two types of problem solving.

1.) Your (prototype) gizmo is not working.
I call this de-bugging. The problem could be somewhere in the
gizmo, or you may have made a fundamental error in your idea.
Those are the hardest types of problems.

2.) You've got several working units but this one from production
has a problem not seen before.
I call that troubleshooting... it's easier because you've got working
units, so you know it can't be a fundamental problem.
It could still be a design problem. Like you didn't spec the spread in
cap ESR on the voltage regulator, and the odd high- or low-ESR cap causes
your voltage regulator to oscillate.
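
A rough sketch of the ESR check implied above. All numbers are invented for illustration: the stable ESR window would have to come from the regulator's data sheet, and the sample values stand in for measurements on a batch of output capacitors.

```python
def esr_outliers(esr_samples_ohm, stable_min_ohm, stable_max_ohm):
    """Return the ESR values that fall outside the regulator's stable window."""
    return [
        esr for esr in esr_samples_ohm
        if not (stable_min_ohm <= esr <= stable_max_ohm)
    ]


# Hypothetical stable window for the regulator's output capacitor, in ohms.
STABLE_MIN_OHM = 0.05
STABLE_MAX_OHM = 3.0

# Hypothetical ESR measurements from one batch of capacitors, in ohms.
measured = [0.12, 0.30, 0.08, 0.02, 1.1, 4.5]

bad = esr_outliers(measured, STABLE_MIN_OHM, STABLE_MAX_OHM)
print(f"{len(bad)} of {len(measured)} caps outside the stable window: {bad}")
```

Most of the batch passes, which is why the first several units work and the odd one from production oscillates.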

Add 3) It fails on some customers' site, but not elsewhere.

Now, is it because the customers' equipment is at fault or
the spec is inadequate (whatever that might mean)?

The Cambridge Instruments example of that problem was an electron beam microfabricator which wrote patterns fine, but sometimes bits of the pattern were half a micron away from where they ought to be.

We had to fly our chief engineer to America to sort it out - he was a brilliant diagnostician, but what turned out to be crucial was that he was an engineering history buff. When he got into the lift to go up to see the machine he said "This is a hydraulic lift" which it was.

The large lump of wrought iron that served as the piston pushed up by water pressure was magnetic, and when the lift was up the ambient magnetic field at the electron beam microfabricator changed enough to move the electron beam half a micron.

It became a legendary "super suss". The lab had to stop the lift from being used while a pattern was being written, which wasn't much of an inconvenience, and a lot cheaper than any other possible cure.

--
Bill Sloman, Sydney
 
