Approach to Finding the Root Cause of Failures

Guest
Another topic that I hope can elicit engineering discussion:

What makes up a good skill set for finding the root cause of a failure that is rare, intermittent or obscure?

Over the past several years I have been more involved in root cause failure analysis than I was when I was doing more design work. In many ways I think it is more challenging than design work. It takes a mindset that is different than design.

Here is my reminder list when doing root cause studies:

1. Never root for a particular outcome when performing a test. Root for not being fooled by the results of your test.

2. Assign weighting factors to everything you believe. Never assign a weighting factor of 1 to anything until you know you have the problem solved.

3. Expect to have to repeat certain tests, and expect that a repeat will sometimes lead you to the opposite of what you concluded after the first test.

4. Taking guidance from "helpful" outsiders is challenging. On the one hand they care and are smart; on the other hand, if you go chasing other people's ideas (often conceived just to demonstrate concern in a meeting) you will never get a clear path to troubleshoot the problem in your own way.
Help is a two-edged sword. It is important but can sometimes be problematic.

5. As an aside - I have learned that when I "see something" odd during the design phase, I no longer look at that as a curse, but as a blessing. If you ignore it, it is going to come back and get you later.

6. Get past the notion that having nothing to show for a day's work is bad. As a designer you can show a day's work for a day's pay. In root cause work you feel like you have accomplished nothing for a long time. Frequently, though, these problems are the most visible problems in an organization and can make the difference between losing a customer and keeping one.

7. Look for contradictions in your thinking. Use other people to help you find contradictions in your thinking.
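
One way to make item 2 concrete is to keep the weights in a table and rescale them after every test. The sketch below is only an illustration of that bookkeeping, done Bayes-style; the hypothesis names, starting weights and likelihood numbers are all made up for the example and are not from any real investigation.

```python
def update_weights(weights, likelihoods):
    """Scale each hypothesis weight by how well that hypothesis predicted
    the test result, then renormalize so the weights sum to 1."""
    unnormalized = {h: weights[h] * likelihoods[h] for h in weights}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("test result is impossible under every hypothesis")
    return {h: w / total for h, w in unnormalized.items()}


# Hypothetical starting beliefs: nothing gets a weight of 0 or 1.
weights = {"grounding": 0.5, "firmware": 0.3, "connector": 0.2}

# Hypothetical test: the failure persisted after reseating the connector.
# Each number is P(seeing that result | the hypothesis is the real cause).
likelihoods = {"grounding": 0.9, "firmware": 0.9, "connector": 0.2}

weights = update_weights(weights, likelihoods)
for name, w in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{name:10s} {w:.2f}")
```

As long as no likelihood is entered as exactly 0, no hypothesis ever reaches a weight of 0 or 1, which matches the spirit of the rule: a test can demote an idea, but it takes more than one test to bury it.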

OK - enough for now......
 
On Tuesday, March 31, 2020 at 11:34:44 AM UTC-4, blo...@columbus.rr.com wrote:

Also - the FPGA guys and the SW guys will only acknowledge a problem when it is laid out under their nose. It is never their fault :)
 
On Tuesday, March 31, 2020 at 11:34:44 AM UTC-4, blo...@columbus.rr.com wrote:

Oh yeah....If it is RF related there is a >50% chance it is grounding related
 
On Tuesday, March 31, 2020 at 11:48:11 AM UTC-4, George Herold wrote:
Yeah I'd call this trouble shooting.

This is talking about problems that are deeper than troubleshooting. Isolating broken parts is troubleshooting. This is finding the hidden corner cases in a design, the ones that typically are not seen until hundreds of units are in the field finding those corner cases.

The most important thing IMHO
is not to make assumptions about the cause early on. This is hard
because we all look for an 'answer' first and then try and test it.
So get as much data on the problem as you can. Then make a list of
all possible things it might be. And a list of possible tests.
(Then go to sleep or do something else and maybe some other ideas
will form in your brain.)

Finding intermittent problems is the worst. And it's sometimes
useful trying to make it fail more often.

George H.
 
On Tuesday, March 31, 2020 at 11:48:11 AM UTC-4, George Herold wrote:
Finding intermittent problems is the worst. And it's sometimes
useful trying to make it fail more often.

I missed that one.... Working hard to replicate the failure requires more energy than everything else, because in the end, if you cannot replicate it, you probably (exceptions to every rule) do not know for sure what it is.
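
A rough way to see why raising the failure rate pays off: if the bug still fires with probability p on each trial, the chance of n clean trials in a row is (1 - p)^n. The sketch below turns that around and asks how many consecutive passes are needed before "it didn't fail" means much. It assumes independent trials and a known failure rate, which real intermittents rarely give you, and the function name and the 95% figure are just choices made for the example.

```python
import math


def passes_needed(fail_rate, confidence):
    """Number of consecutive passing trials after which a unit that still
    failed at `fail_rate` per trial would have shown at least one failure
    with probability `confidence` (trials assumed independent)."""
    if not (0.0 < fail_rate < 1.0 and 0.0 < confidence < 1.0):
        raise ValueError("fail_rate and confidence must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - fail_rate))


print(passes_needed(0.01, 0.95))  # fails 1 time in 100  -> 299 clean runs
print(passes_needed(0.20, 0.95))  # fails 1 time in 5    -> 14 clean runs
```

Pushing the failure rate from 1% to 20% (by heat, supply margining, vibration, whatever provokes it) cuts the test needed to check a fix from about 300 runs to 14.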
 
On Tue, 31 Mar 2020 08:34:34 -0700 (PDT), blocher@columbus.rr.com
wrote:

5. As an aside - I have learned that when I "see something" odd during the design phase, I no longer look at that as a curse, but as a blessing. If you ignore it, it is going to come back and get you later.

That's a good one. Don't dismiss a weird observation just because it
goes away. People are emotionally primed to do that.



--

John Larkin Highland Technology, Inc

Science teaches us to doubt.

Claude Bernard
 
On 31/03/2020 17:40, blocher@columbus.rr.com wrote:
Also - the FPGA guys and the SW guys will only acknowledge a problem when it is laid out under their nose. It is never their fault :)

That's because it's usually a hardware fault - and it can be solved by
using a bigger capacitor :)
 
On Tuesday, March 31, 2020 at 12:10:18 PM UTC-4, blo...@columbus.rr.com wrote:
This is talking about problems that are deeper than troubleshooting. Isolating broken parts is troubleshooting. This is finding the hidden corner cases in a design, the ones that typically are not seen until hundreds of units are in the field finding those corner cases.

Hmm OK. I designate two types of problem solving.

1.) Your (prototype) gizmo is not working.
I call this de-bugging. The problem could be somewhere in the
gizmo, or you may have made a fundamental error in your idea.
Those are the hardest types of problems.

2.) You've got several working units but this one from production
has a problem not seen before.
I call that trouble shooting... it's easier because you've got working
units, so you know it can't be a fundamental problem.
It could still be a design problem. Like you didn't spec the spread in
cap ESR on the voltage regulator, and the odd high- or low-ESR cap causes
your voltage regulator to oscillate.

Trouble shooting is by far where I've spent most of my
'problem solving' time.

I guess there is some intermediate case, where your prototype
gizmo is (mostly) working, but there's a glitch or something not
understood on the edge cases. You could call that trouble shooting
or de-bugging.

I used to work for a small company, not too many units made per year,
and would half joke that our customers were our beta testers.

I'm not sure having a 'simple' broken component makes things any easier.
I remember ripping up this whole circuit piece by piece, to finally discover
that a toggle switch had ~1 megohm of resistance when open.
(it drove me crazy for a few days.)

George H.

 
On 2020-03-31 11:48, George Herold wrote:
Finding intermittent problems is the worst. And it's sometimes
useful trying to make it fail more often.

Very. Making it worse is as good as making it better.

Also, as you're going through it, fix every problem that you find.
Surprisingly often that'll also fix the mysterious one. Long ago, I had
a sensitive front end which had a horrible offset voltage problem.

It was a low cost optical head tracker for computers, and used modulated
IR LEDs to illuminate your forehead, and three pairs of photodiodes
positioned behind a shadow mask to get XYZ position of the bright patch.
The PDs were chopped at 100 kHz, and each channel had an MC1496 to do
the synchronous detection. One channel had a horrible offset voltage
problem.

Everything I did seemed to make it worse. Turned out to be the 100 kHz
getting in from the noisy supply via a 1-pole cap multiplier ripple (180
degrees lag from two poles) and stray capacitance to the noisy supply
(90 degrees lag from the filter - 90 degrees lead from the stray
capacitance). Both contributions were in phase with the LO, and just
about exactly the same size. Fixing the supply ripple revealed just how
bad the stray contribution was: I had a single 1-mm pad over a
slightly-noisy supply pour. A BFC fixed both.
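
To illustrate why fixing one of two pickup paths can make the offset look worse, here is a toy model of the detector, not a reconstruction of the actual circuit. It assumes an ideal synchronous detector, where an interferer at the LO frequency produces a DC term proportional to amplitude times the cosine of its phase relative to the LO. The amplitudes are invented, and the 180 and 0 degree phases are one reading of the phase sums quoted above.

```python
import math


def detector_offset(contributions):
    """DC output of an idealized synchronous detector for interferers at the
    LO frequency: sum of amplitude * cos(phase relative to the LO).
    Conversion gain is dropped, so the result is in arbitrary units."""
    return sum(amp * math.cos(math.radians(phase_deg))
               for amp, phase_deg in contributions)


# Hypothetical (amplitude, phase in degrees) pairs, nearly equal in size.
supply_ripple = (1.0, 180.0)  # path through the supply filter
stray_pickup = (0.9, 0.0)     # path through the stray capacitance

both = detector_offset([supply_ripple, stray_pickup])
ripple_fixed = detector_offset([stray_pickup])

print(f"both paths present: {both:+.2f}")
print(f"ripple path fixed:  {ripple_fixed:+.2f}")
```

With both paths present, the two terms mostly cancel and only a small residue shows up. Removing one path exposes the full size of the other, which is how a correct fix can look like a step backwards.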

One more: problems never "just go away". Even if it's a rare EMI
condition, like Joerg's example of the radar EMI in the other thread,
the EMI vulnerability didn't go away when they closed the aluminum blinds.

Cheers

Phil Hobbs

--
Dr Philip C D Hobbs
Principal Consultant
ElectroOptical Innovations LLC / Hobbs ElectroOptics
Optics, Electro-optics, Photonics, Analog Electronics
Briarcliff Manor NY 10510

http://electrooptical.net
http://hobbs-eo.com
 
On Tuesday, March 31, 2020 at 1:11:24 PM UTC-4, Phil Hobbs wrote:
Everything I did seemed to make it worse. Turned out to be the 100 kHz
getting in from the noisy supply via a 1-pole cap multiplier ripple (180
degrees lag from two poles) and stray capacitance to the noisy supply
(90 degrees lag from the filter - 90 degrees lead from the stray
capacitance). Both contributions were in phase with the LO, and just
about exactly the same size. Fixing the supply ripple revealed just how
bad the stray contribution was: I had a single 1-mm pad over a
slightly-noisy supply pour. A BFC fixed both.

Oh, two 'layers' of the not-working 'onion' have about the same
magnitude but opposite signs... that's insidious.

One more: problems never "just go away". Even if it's a rare EMI
condition, like Joerg's example of the radar EMI in the other thread,
the EMI vulnerability didn't go away when they closed the aluminum blinds.

Well, unless the sample is changing with time.
I had these new Rb cells from a reputable supplier. Dang things had
some signs of residual gas in them. I made some guesstimate of the amount
of gas (Ramsey, "Molecular Beams"). The supplier had some test whereby
he could tell the gas was below some level. We went back and forth for a few
days, and finally we agreed I'd send one back for him to test again.
A week or so later I got the cell back from him, having checked out fine
the second time. When I tested it again (after those few weeks) it was fine.
And the other 9 cells (that I'd kept) were also all fine now.

George H.
Cheers

Phil Hobbs

--
Dr Philip C D Hobbs
Principal Consultant
ElectroOptical Innovations LLC / Hobbs ElectroOptics
Optics, Electro-optics, Photonics, Analog Electronics
Briarcliff Manor NY 10510

http://electrooptical.net
http://hobbs-eo.com
 
On Tuesday, March 31, 2020 at 11:40:42 AM UTC-4, blo...@columbus.rr.com wrote:
Also - the FPGA guys and the SW guys will only acknowledge a problem when it is laid out under their nose. It is never their fault :)

Sure, why should they waste time chasing hardware problems which they can't duplicate in their simulations? It's hard to prove a problem *isn't* in the SW, so they expect to see proof that it *is* in the SW. It's the only rational way to handle it.

I recall once spending days adding debug features to an FPGA and the ah-ha moment in the lab when I said, "It's almost as if it isn't being initialized". Sure enough, that was the problem. I didn't have to spend as much time in the lab after that.

--

Rick C.

- Get 1,000 miles of free Supercharging
- Tesla referral code - https://ts.la/richard11209
 
On Tuesday, March 31, 2020 at 11:48:11 AM UTC-4, George Herold wrote:
Finding intermittent problems is the worst. And it's sometimes
useful to try to make it fail more often.

Yeah, someone told me the failure-rate requirement for telephone equipment is so stringent that a failure at that level is nearly impossible to find. This requires a whole different mindset for the design and test process. Essentially, you have to prove that every part of your design works rather than testing for a failure.

Kinda like medical equipment.

--

Rick C.

 
On Tue, 31 Mar 2020 08:40:34 -0700 (PDT), blocher@columbus.rr.com
wrote:

Also - the FPGA guys and the SW guys will only acknowledge a problem when it is laid out under their nose. It is never their fault :)

FPGA people bench test pretty hard, so want serious explanations of
why things went wrong... which they seldom do. Programmers seem to
accept that there will be bugs.

--

John Larkin Highland Technology, Inc
picosecond timing precision measurement

jlarkin att highlandtechnology dott com
http://www.highlandtechnology.com
 
On Tuesday, March 31, 2020 at 12:41:36 PM UTC-4, David Brown wrote:
On 31/03/2020 17:40, blocher@columbus.rr.com wrote:
Also - the FPGA guys and the SW guys will only acknowledge a problem when it is laid out under their nose. It is never their fault :)


That's because it's usually a hardware fault - and it can be solved by
using a bigger capacitor :)

You laugh, but I once used a telephony part that had a PSRR of 0 dB, which I had missed. (Who expects 0 dB?) On the customer's workbench they were getting noise in the audio that turned out to come from the DSP's power consumption. They were using clip leads to provide power to the UUT, and the on-board capacitance wasn't enough to mitigate it. We told them to use better power connections and also used a larger cap.

0 dB of PSRR??? How can you even do that exactly??? CP Clare, what a piece of work they are. The other CP Clare part had a problem that virtually made it unusable, but they didn't point it out in the data sheet. I wonder if they actually use engineers or if they just let high school kids design their ICs?
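
For anyone wondering what 0 dB of PSRR works out to, here is a small Python sketch using the usual definition, PSRR in dB = 20*log10(supply ripple / ripple appearing at the output). The 50 mV ripple figure and the 60 dB comparison part are made-up numbers, not from the part in question.

```python
def output_ripple(supply_ripple_v: float, psrr_db: float) -> float:
    """Ripple that reaches the output for a given supply ripple,
    using PSRR(dB) = 20*log10(supply ripple / output ripple)."""
    return supply_ripple_v * 10.0 ** (-psrr_db / 20.0)

supply_ripple = 0.05  # hypothetical 50 mV of ripple on the supply rail

for psrr in (0.0, 60.0):
    millivolts = output_ripple(supply_ripple, psrr) * 1e3
    print(f"PSRR {psrr:4.0f} dB -> {millivolts:.3f} mV at the output")
```

At 0 dB every millivolt on the rail shows up as a millivolt in the audio, so the only rejection left is whatever the supply wiring and the capacitors provide.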

This was my first project as an independent engineer and I never forgot the lessons I learned on that. The other big ones were to not do your own procurement and NEVER trust a disti delivery date.

--

Rick C.

 
On Tuesday, March 31, 2020 at 8:34:44 AM UTC-7, blo...@columbus.rr.com wrote:
Another topic that I hope can elicit engineering discussion:

What makes up a good skill set for finding the root cause of a failure that is rare, intermittent or obscure?

It's not about attitude, really, but about the PARTS that compose the
problematic item.

You really want an analysis, a breakdown of all of the elements of the
apparatus. Have you ever considered the internal mechanical construction of the
batteries? Loose connections can be internal to a dry cell. Or,
thermal sensitivity of wiring (because of thermocouple effects)?
So, pretend you have X-ray vision, and consider all the parts, even if YOU didn't
handle them except as subassemblies. Importance can attach to the
plating on a washer, or a choice of glue, or an historic supply-chain shift.

It might be a contaminant in the chemistry of a 'pure' material. The tale
is told of a failure of carrier lifetime at Fairchild, which was traced to the
introduction of Lemon-Fresh Joy detergent.
 
On 31/03/20 18:17, George Herold wrote:
Hmm OK. I designate two types of problem solving.

1.) Your (prototype) gizmo is not working.
I call this de-bugging. The problem could be somewhere in the
gizmo, or you may have made a fundamental error in your idea.
Those are the hardest types of problems.

2.) You've got several working units but this one from production
has a problem not seen before.
I call that trouble shooting... it's easier because you've got working
units, so you know it can't be a fundamental problem.
It could still be a design problem. For example, you didn't spec the
spread in cap ESR on the voltage regulator, and the odd high- or low-ESR
cap causes your voltage regulator to oscillate.

Add 3) It fails on some customers' site, but not elsewhere.

Now, is it because the customers' equipment is at fault or
the spec is inadequate (whatever that might mean)?
 
On Tuesday, March 31, 2020 at 1:59:07 PM UTC-4, Tom Gardner wrote:
Add 3) It fails on some customers' site, but not elsewhere.

Now, is it because the customers' equipment is at fault or
the spec is inadequate (whatever that might mean)?

Yeah, I'd still call that trouble shooting 'cause you know it works
most places.
Dealing with customer problems is a whole 'nother ball of wax.
1.) they are customers
2.) they might be (experimental) idiots
3.) they might have a 'real' problem.

It's a delicate dance.
George H.
 
On Tue, 31 Mar 2020 14:55:02 -0400, ABLE1 <somewhere@nowhere.net>
wrote:

Whoops!! 2nd try!!

With all the above being typed and read, I have a much simpler way
to look at the problem.

Just use the "Not Method of Troubleshooting".

The Not Method goes like this.

It's Not this!!
It's Not that!!
Once you have identified all the Not's, the only thing left
is not a not, but is the real problem.
Fix it or replace and move on!!

Now I am sure someone will find fault with my method, well Ok then!!
Some days the Not's just have to be adjusted.

Have a good day!!

Les

That's the Sherlock Holmes technique. It doesn't work very well. The
list of NOTs to test is too big, and you are unlikely to include in
the list the things you missed when you did the design.

--

John Larkin Highland Technology, Inc
 
On Tue, 31 Mar 2020 09:10:09 -0700 (PDT), blocher@columbus.rr.com
wrote:

On Tuesday, March 31, 2020 at 11:48:11 AM UTC-4, George Herold wrote:
On Tuesday, March 31, 2020 at 11:34:44 AM UTC-4, blo...@columbus.rr.com wrote:
1. never root for a particular outcome when performing a test. Root for not being fooled by the results of your test

You can definitely be fooled by the test itself. In my case, a lot of
times it has to do with common-mode noise getting into the
measurements.



Yeah I'd call this trouble shooting.

This is talking about problems that are deeper than troubleshooting. Isolating broken parts is troubleshooting. This is finding the hidden corner cases in a design, which typically are not seen until hundreds of units are in the field finding those corner cases.

Yes, I would call it a step past troubleshooting but troubleshooting
is a big part of it.

The phrase I like is that you have to "understand the problem"

That I normally do in the engineering phase, but the cause is usually some
error in the original engineering: something designed on the edge
rather than for the worst case, or even documentation that lacks enough
information.


boB


 
