design rigor: electronics vs. software

On Sunday, January 12, 2020 at 5:55:08 PM UTC-5, Phil Hobbs wrote:
On 2020-01-12 17:38, jjhudak4@gmail.com wrote:
On Sunday, January 12, 2020 at 3:32:06 PM UTC-5,
DecadentLinux...@decadence..org wrote:
Phil Hobbs <pcdhSpamMeSenseless@electrooptical.net> wrote in
news:fb4888b5-e96f-1145-85e8-bc382c9bdcdf@electrooptical.net:

Back in my one foray into big-system design, we design engineers
were always getting in the systems guys' faces about various
pieces of stupidity in the specs. It was all pretty
good-natured, and we wound up with the pain and suffering
distributed about equally.



That is how men get work done... even 'the programmers'. Very
well said, there.

That is like the old dig on 'the hourly help'.

Some programmers are very smart. Others not so much.

I guess choosing to go into it is not such a smart move so they
take a hit from the start. :)


If that is how men get work done, then they are not using the software
and system engineering techniques developed in the last 15-20 years,
and their results are *still* subject to the same types of errors. I
do research and teach in this area. A number of studies, and one in
particular, cite that up to 70% of software faults are introduced on
the LHS of the 'V' development model. (Other software design lifecycle
models have similar fault percentages.) A major issue is that most
of these errors are not observed until integration time
(software+software, software+hardware). The cost of defect removal
along the RHS of the 'V' development model is anywhere from 50-200X
the removal cost along the LHS of the 'V'. (No wonder systems
cost so much.)

Nice rant. Could you tell us more about the 'V' model?

The talk about errors in this thread is very high level, and most
people are thinking about errors at the unit-test level. There are
numerous techniques developed to identify and fix fault types
throughout the entire development lifecycle, but regrettably a lot of
them are not employed.

What sorts of techniques do you use to find problems in the specifications?
Actually a large percentage of the errors are discovered and fixed at
that level. Errors such as units mismatches, variable type
mismatches, and a slew of concurrency issues aren't discovered until
integration time. Usually, at that point, there is a 'rush' to get
the system fielded. The horror stories and lessons learned are well
documented.
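
A toy sketch (hypothetical names, nothing from any real avionics code)
of how a units mismatch like that can hide until two modules meet at
integration, and how tagging values with their unit surfaces it
immediately at the interface:

# Toy illustration only -- hypothetical modules, not from any real project.
# A bare float carries no unit, so a feet/metres mix-up surfaces only when
# two separately developed modules are integrated. Tagging values with
# their unit turns that integration-time surprise into an immediate error.

from dataclasses import dataclass

@dataclass(frozen=True)
class Quantity:
    value: float
    unit: str          # e.g. "m", "ft", "deg"

    def __add__(self, other: "Quantity") -> "Quantity":
        if self.unit != other.unit:
            raise ValueError(f"units mismatch: {self.unit} vs {other.unit}")
        return Quantity(self.value + other.value, self.unit)

def altitude_module() -> Quantity:     # one team works in metres
    return Quantity(1200.0, "m")

def terrain_module() -> Quantity:      # another team works in feet
    return Quantity(350.0, "ft")

if __name__ == "__main__":
    # Fails loudly at the interface instead of flying with a bogus sum.
    clearance = altitude_module() + terrain_module()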

Yup. Leaving too much stuff for the system integration step is a very
very well-known way to fail.

IDK what exactly happened (yet) with the Boeing MAX development. I
do have info from some sources that cannot be disclosed at this
point. From what I've read, there were major mistakes made from
inception through implementation and integration. My personal view is
that one should almost never (never?) place the burden on software to
correct an inherently unstable airframe design - it is putting a
band-aid on the source of the problem.

It's commonly done, though, isn't it? I remember reading Ben Rich's
book on the Skunk Works, where he says that the F-117's very squirrelly
handling characteristics were fixed up in software to make it a
beautiful plane to fly. That was about 1980.

Another major issue is that the hazard analysis and fault-tolerance
approach was not done at the system level (the redundancy approach was
pitiful, as was the *logic* used in implementing it, both conceptually
and in the implementation).

I do think that the better software engineers have a more holistic
view of the system (hardware knowledge + system operational
knowledge), which allows them to ask questions when things don't
'seem right.' OTOH, software engineers should not go making
assumptions about things and coding to those assumptions. (It happens
more than you think.) It is the job of the software architect to
ensure that any development assumptions are captured and specified
in the software architecture.

In real life, though, it's super important to have two-way
communications during development, no? My large-system experience was
all hardware (the first civilian satellite DBS system, 1981-83), so
things were quite a bit simpler than in a large software-intensive
system. I'd expect the need for bottom-up communication to be greater
now rather than less.

In studies I have looked at, requirements errors account for somewhere
between 30-40% of the overall number of faults introduced during the
design lifecycle, and the 'industry standard' approach to dealing with
this problem is woefully inadequate despite the existence of
techniques to detect and remove the errors. A LOT of time is spent
doing software requirements tracing as opposed to doing verification
of requirements. People argue that one cannot verify the requirements
until the system has been built - which is complete BS, but industry
is very slow to change. We have shown that software architecture
modeling addresses a large percentage of system-level problems early
in the design life cycle. We are trying to convince industry. Until
change happens, the parade of failures like the MAX will continue.

I'd love to hear more about that.

Cheers

Phil Hobbs


--
Dr Philip C D Hobbs
Principal Consultant
ElectroOptical Innovations LLC / Hobbs ElectroOptics
Optics, Electro-optics, Photonics, Analog Electronics
Briarcliff Manor NY 10510

http://electrooptical.net
http://hobbs-eo.com

Sorry - I get a bit carried away on this topic...
For requirements engineering verification one can Google formal and semi-formal requirements specification languages; RDAL and ReqSpec are ones I am familiar with.
Techniques to verify requirements include model checking (Google 'model checking'). It is based on formal logics like LTL (Linear Temporal Logic) and CTL (Computation Tree Logic). One constructs state models from the requirements and uses model-checking engines to analyze those structures. Model checking was actually used to verify a bus protocol in the early '90s and found *lots* of problems with the spec... that caused industry to 'wake up'.
There are others that work on code, but these are very much research-y efforts.
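
To give the flavor of it, here is a minimal, hand-rolled sketch of the
kind of exhaustive state-space search a model checker automates, run
against a made-up two-sensor model and a made-up safety property (not
Boeing's actual logic, and not how SPIN, NuSMV, etc. are actually
driven):

# Minimal explicit-state reachability check -- a toy stand-in for what
# real LTL/CTL model checkers do exhaustively and at far larger scale.
# The model and the property are hypothetical, invented for illustration.

from collections import deque
from itertools import product

# State: (sensor_a, sensor_b, active_source)
SENSOR = ("ok", "stuck_high")

def successors(state):
    a, b, src = state
    nxt = set()
    for a2, b2 in product(SENSOR, SENSOR):
        # sensors may fail in flight but never self-repair
        if (a == "stuck_high" and a2 == "ok") or (b == "stuck_high" and b2 == "ok"):
            continue
        nxt.add((a2, b2, src))          # single-source design: never switches
    return nxt

def violates_safety(state):
    a, b, src = state
    active = a if src == "A" else b
    other = b if src == "A" else a
    # Property: never act on a stuck sensor while a healthy one is available,
    # i.e. the system must cross-check before commanding trim.
    return active == "stuck_high" and other == "ok"

def check(initial):
    seen, frontier = {initial}, deque([initial])
    while frontier:
        s = frontier.popleft()
        if violates_safety(s):
            return s                    # counterexample state
        for t in successors(s) - seen:
            seen.add(t)
            frontier.append(t)
    return None

if __name__ == "__main__":
    cex = check(("ok", "ok", "A"))
    print("counterexample:", cex)       # the single-source design fails the check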

Simulink has a model checker in its toolboxes (based on Promela); it is quite good.

We advocate using architecture description languages (ADLs) - formal modeling notations for modeling different views of the architecture and capturing properties of the system from which analyses can be done (e.g. signal latency, variable format and property consistency, processor utilization, bandwidth capacity, hazard analysis, etc.). The one that I had a hand in designing is the Architecture Analysis and Design Language (AADL); it is an SAE Aerospace standard. If things turn out well, it will be used on the next generation of helicopters for the Army. We have been piloting its use on real systems for the last 2-3 years, and on pilot studies for the last 10 years.
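
As a rough illustration (not AADL syntax, and with invented components
and numbers), the kind of end-to-end flow-latency check such a model
supports looks like this:

# Back-of-the-envelope flow-latency check in the spirit of (but not actual)
# AADL end-to-end flow analysis. Components, numbers and the budget are
# made-up placeholders for illustration.

from dataclasses import dataclass

@dataclass
class Component:
    name: str
    worst_case_ms: float

def end_to_end_latency(flow):
    return sum(c.worst_case_ms for c in flow)

sensor_to_actuator = [
    Component("AoA sensor sampling", 5.0),
    Component("bus transfer",        2.0),
    Component("control-law task",   10.0),   # period + execution bound
    Component("actuator command",    3.0),
]

REQUIREMENT_MS = 25.0   # hypothetical end-to-end budget from the spec

if __name__ == "__main__":
    total = end_to_end_latency(sensor_to_actuator)
    status = "OK" if total <= REQUIREMENT_MS else "VIOLATED"
    print(f"flow latency {total:.1f} ms vs budget {REQUIREMENT_MS:.1f} ms: {status}")
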
For system hazard analysis, Google STPA (System-Theoretic Process Analysis), spearheaded by Nancy Leveson at MIT (she has consulted for Boeing).

Yes, I've seen software applied to fix hw problems but assessing the risk is complicated. The results can be catastrophic.
Ok, off my rant....
 
On Sunday, January 12, 2020 at 5:55:08 PM UTC-5, Phil Hobbs wrote:
[snip]

I forgot to add that the act of building a formal model in AADL from the requirements forces one to *think* about system-wide impacts and to do analysis on the architectural model.
Requirements are written in English; one of the most widely used tools is MS Word. Another is DOORS.
 
On 2020-01-12 19:13, jjhudak4@gmail.com wrote:
[snip]

Thanks. I feel a bit like I'm drinking from a fire hose, which is
always my preferred way of learning stuff.... I'd be super interested
in an accessible presentation of methods for sanity-checking
high-level system requirements.

Being constitutionally lazy, I'm a huge fan of ways to work smarter
rather than harder. ;)

Cheers

Phil Hobbs


--
Dr Philip C D Hobbs
Principal Consultant
ElectroOptical Innovations LLC / Hobbs ElectroOptics
Optics, Electro-optics, Photonics, Analog Electronics
Briarcliff Manor NY 10510

http://electrooptical.net
http://hobbs-eo.com
 
On Sun, 12 Jan 2020 15:10:43 -0800 (PST), George Herold
<ggherold@gmail.com> wrote:

On Sunday, January 12, 2020 at 12:42:54 AM UTC-5, John Larkin wrote:
On 11 Jan 2020 17:50:07 -0800, Winfield Hill <winfieldhill@yahoo.com
wrote:

Rick C wrote...

That board is duck soup to lay out.

I dunno, a 176-pin PLCC and a 256-pin BGA, plus
lots of other critical stuff, that's not so clear.

Anyway, I think John made his point.

There are four photodiode time stampers with 6 ps resolution, and
three delay generators with sub-ps resolution. There's a high-speed
SPI-like link to a control computer, and five more to energy
measurement boxes. Lots of controlled-impedance clocks and signals.
Wow! That sounds like quite a box. Four inputs? What's the dead time
on a channel? More or less than $10k?

George H.

It's a controller for a deep-UV MOPA laser, for IC lithography. The
pulse rate is about 6 kHz. Way less than $10K.


--

John Larkin Highland Technology, Inc trk

jlarkin att highlandtechnology dott com
http://www.highlandtechnology.com
 
On Sun, 12 Jan 2020 16:58:40 +0000, Martin Brown
<'''newspam'''@nezumi.demon.co.uk> wrote:

On 11/01/2020 14:57, jlarkin@highlandsniptechnology.com wrote:
On 11 Jan 2020 05:57:59 -0800, Winfield Hill <winfieldhill@yahoo.com
wrote:

Rick C wrote...

Then your very example of the Boeing plane is wrong
because no one has said the cause of the accident
was improperly coded software.

Yes, it was an improper spec, with dangerous reliance
on poor hardware.

If code kills people, it was improperly coded.

Not necessarily. The code written may well have exactly implemented the
algorithm(s) that the clowns supervised by monkeys specified. It isn't
the job of the programmers to double-check the workings of the people
who do the detailed calculations of aerodynamic force vectors and torques.

It is not the programmers' fault if the systems engineering, failure
analysis and aerodynamics calculations are incorrect in some way!

The management of two AOA sensors was insane. Fatal, actually. A
programmer should understand simple stuff like that.


--

John Larkin Highland Technology, Inc trk

jlarkin att highlandtechnology dott com
http://www.highlandtechnology.com
 
On 13/01/20 01:07, John Larkin wrote:
[snip]

The management of two AOA sensors was insane. Fatal, actually. A
programmer should understand simple stuff like that.

It is unrealistic to expect programmers to understand sensor
reliability. That is the job of the people specifying the
system design and encoding that in the system specification
and the software specification.

Programmers would have zero ability to deviate from implementing
the software spec, full stop. If they did knowingly deviate, it
would be a career ending decision - at best.

Aerospace engineers have lost their pension for far less
serious deviations, even though they had zero consequences.
 
Tom Gardner <spamjunk@blueyonder.co.uk> wrote in
news:ooWSF.30854$Bf2.20780@fx39.am4:

It is unrealistic to expect programmers to understand sensor
reliability. That is the job of the people specifying the
system design and encoding that in the system specification
and the software specification.

I think it would be nice to have a full understanding of ANY failure
modes of ANY transducer whose readings I would be programming actions
from. So I would at least want to be at those meetings. ;-)

So, in the 737 MAX scenario, I would want to know about the
angle-of-attack sensor sticking from icing up. As far as I know, the
actual encoding in them is a simple slot mask on a disc (optical
encoder wheel), which can resolve to a couple of ticks per degree with
ease, more if a higher resolution were needed.

I would place two wheels on each and a 'kicker' device that turns
it through its full travel and then releases it for reading again
(and maybe a heater for the bearings). That way it could be checked
for failed/free operation while in flight.
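
Purely to illustrate the cross-check being argued for here (thresholds
and behaviour invented for the example, not anyone's actual flight
code): compare the two vanes every cycle, flag a stuck or diverging
one, and drop automatic trim authority rather than trusting a single
reading.

# Illustration only: a plain disagree/stuck cross-check on two AoA readings.
# Thresholds and behaviour are invented for the example, not any real system.

DISAGREE_DEG = 5.5     # hypothetical max allowed A/B disagreement
STUCK_EPS    = 0.05    # a reading that never moves is suspicious
STUCK_COUNT  = 50      # consecutive frozen samples before declaring "stuck"

class AoAMonitor:
    def __init__(self):
        self.last = {"A": None, "B": None}
        self.frozen = {"A": 0, "B": 0}

    def update(self, a_deg, b_deg):
        for name, value in (("A", a_deg), ("B", b_deg)):
            prev = self.last[name]
            if prev is not None and abs(value - prev) < STUCK_EPS:
                self.frozen[name] += 1
            else:
                self.frozen[name] = 0
            self.last[name] = value

        stuck = [n for n, c in self.frozen.items() if c >= STUCK_COUNT]
        disagree = abs(a_deg - b_deg) > DISAGREE_DEG
        # Any doubt -> inhibit automatic trim and annunciate, rather than
        # acting on a single possibly-bad vane.
        trim_allowed = not disagree and not stuck
        return trim_allowed, {"disagree": disagree, "stuck": stuck}

if __name__ == "__main__":
    mon = AoAMonitor()
    print(mon.update(4.2, 4.5))    # healthy pair -> trim allowed
    print(mon.update(74.5, 4.6))   # gross disagreement -> trim inhibited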
 
In article <d0nj1f50mabot5tnfooihn6o50up57n22b@4ax.com>,
jlarkin@highlandsniptechnology.com says...
[snip]

My Spice sims are often wrong initially, precisely because there are
basically no consequences to running the first try without much
checking. That is of course dangerous; we don't want to base a
hardware design on a sim that runs and makes pretty graphs but is
fundamentally wrong.

I just got bitten by a 'feature' of LTSpice XVII; I don't remember IV
having this behaviour, but I don't have it installed any more:

If you make a tweak to a previously working circuit, which makes the
netlister fail (in my case it was an inductor shorted to ground at both
ends), it will pop up a warning to this effect, and then *run the sim
using the old netlist*.

It will then allow you to probe around on the new schematic, but the
schematic nodes are mapped onto the old netlist, so depending on what
you tweaked, what is displayed can range from slightly wrong to flat-out
impossible.

Anyone else seen this?
 
On 2020-01-13 04:04, Tom Gardner wrote:
[snip]

It is unrealistic to expect programmers to understand sensor
reliability. That is the job of the people specifying the
system design and encoding that in the system specification
and the software specification.

Programmers would have zero ability to deviate from implementing
the software spec, full stop. If they did knowingly deviate, it
would be a career ending decision - at best.

Gee, Mr. Gardner, you're so manly--can I have your autograph? ;)

Nobody's talking about coders doing jazz on the spec AFAICT. Systems
folks do need to listen to them, is all. If they can't do that because
they don't understand the issues, that's a serious organizational
problem, on a level with the flawed spec.

Aerospace engineers have lost their pension for far less
serious deviations, even though they had zero consequences.

Fortunately that's illegal over here, even for cause.

Cheers

Phil Hobbs


--
Dr Philip C D Hobbs
Principal Consultant
ElectroOptical Innovations LLC / Hobbs ElectroOptics
Optics, Electro-optics, Photonics, Analog Electronics
Briarcliff Manor NY 10510

http://electrooptical.net
http://hobbs-eo.com
 
On Mon, 13 Jan 2020 09:04:20 +0000, Tom Gardner
<spamjunk@blueyonder.co.uk> wrote:

[snip]

It is unrealistic to expect programmers to understand sensor
reliability. That is the job of the people specifying the
system design and encoding that in the system specification
and the software specification.

Programmers would have zero ability to deviate from implementing
the software spec, full stop. If they did knowingly deviate, it
would be a career ending decision - at best.

Job ending, not career ending. I wouldn't code something that was
obviously dumb and dangerous.

If someone quits Boeing over an issue like this, it doesn't end their
career. They can find a better employer.

If an interviewer asked "why did you leave Boeing?" I'd tell them.

Aerospace engineers have lost their pension for far less
serious deviations, even though they had zero consequences.

How can a company take away an earned pension? Because an engineer did
something ethical? Sounds like a giant settlement would follow;
quadruple that pension.


--

John Larkin Highland Technology, Inc trk

jlarkin att highlandtechnology dott com
http://www.highlandtechnology.com
 
On Mon, 13 Jan 2020 09:04:20 +0000, Tom Gardner
<spamjunk@blueyonder.co.uk> wrote:

[snip]

It is unrealistic to expect programmers to understand sensor
reliability. That is the job of the people specifying the
system design and encoding that in the system specification
and the software specification.

Programmers would have zero ability to deviate from implementing
the software spec, full stop. If they did knowingly deviate, it
would be a career ending decision - at best.

Aerospace engineers have lost their pension for far less
serious deviations, even though they had zero consequences.

https://philip.greenspun.com/blog/2019/03/21/optional-angle-of-attack-sensors-on-the-boeing-737-max/

Given dual sensors, why would any sane person decide to alternate
using one per flight?

A programmer would have to be awfully thick to not object to that.


--

John Larkin Highland Technology, Inc trk

jlarkin att highlandtechnology dott com
http://www.highlandtechnology.com
 
On 13/01/20 14:01, Phil Hobbs wrote:
[snip]

Gee, Mr. Gardner, you're so manly--can I have your autograph? ;)

Nobody's talking about coders doing jazz on the spec AFAICT.  Systems folks do
need to listen to them, is all.  If they can't do that because they don't
understand the issues, that's a serious organizational problem, on a level with
the flawed spec.

Well, by all accounts there were/are serious organisational
problems in Boeing. Those are probably a significant
contributor to there being a flawed spec.


Aerospace engineers have lost their pension for far less
serious deviations, even though they had zero consequences.

Fortunately that's illegal over here, even for cause.

I was gobsmacked when I heard that, and don't understand it.
But then I don't even understand the concept of pension
"vesting".

Nonetheless, that's what his supervisor (who did his utmost
to save him) in Los Angeles said.
 
On 13/01/20 15:45, John Larkin wrote:
On Mon, 13 Jan 2020 09:04:20 +0000, Tom Gardner
spamjunk@blueyonder.co.uk> wrote:

[snip]

It is unrealistic to expect programmers to understand sensor
reliability. That is the job of the people specifying the
system design and encoding that in the system specification
and the software specification.

Programmers would have zero ability to deviate from implementing
the software spec, full stop. If they did knowingly deviate, it
would be a career ending decision - at best.

Job ending, not career ending. I wouldn't code something that was
obviously dumb and dangerous.

If someone quits Boeing over an issue like this, it doesn't end their
career. They can find a better employer.

If an interviewer asked "why did you leave Boeing?" I'd tell them.

Agreed, but resigning is very different to deliberately
mis-implementing a spec.

In such circumstances I hope I would resign, but there have
been times in my life when that would have been impossible.



Aerospace engineers have lost their pension for far less
serious deviations, even though they had zero consequences.


How can a company take away an earned pension? Because an engineer did
something ethical? Sounds like a giant settlement would follow;
quadruple that pension.

I don't understand that either.

As I remember it, the conscientious worker was placed
under time pressure. Signoff required a signature and
his personal official stamp. He signed in advance of
completing the work, but did not affix his stamp.

A passing body saw that document, reported it, and
the process ground on inexorably from there.

Grossly disproportionate and unfair? You betcha, but
so what.
 
On 13/01/20 15:58, John Larkin wrote:
On Mon, 13 Jan 2020 09:04:20 +0000, Tom Gardner
spamjunk@blueyonder.co.uk> wrote:

[snip]

It is unrealistic to expect programmers to understand sensor
reliability. That is the job of the people specifying the
system design and encoding that in the system specification
and the software specification.

Programmers would have zero ability to deviate from implementing
the software spec, full stop. If they did knowingly deviate, it
would be a career ending decision - at best.

Aerospace engineers have lost their pension for far less
serious deviations, even though they had zero consequences.


https://philip.greenspun.com/blog/2019/03/21/optional-angle-of-attack-sensors-on-the-boeing-737-max/

Given dual sensors, why would any sane person decide to alternate
using one per flight?

Agreed. Especially given the poor reliability of AoA sensors.

The people that wrote and signed off that spec
bear a lot of responsibility.


> A programmer would have to be awfully thick to not object to that.

The programmer's job is to implement the spec, not to write it.

They may have objected, and may have been overruled.

Have you worked in large software organisations?
 
On Mon, 13 Jan 2020 16:40:55 +0000, Tom Gardner
<spamjunk@blueyonder.co.uk> wrote:

[snip]

Agreed. Especially given the poor reliability of AoA sensors.

The people that write and signed off that spec
bear a lot of responsibility


A programmer would have to be awfully thick to not object to that.

The programmer's job is to implement the spec, not to write it

They may have objected, and may have been overruled.

Have you worked in large software organisations?

Not in, but with. Most "just do our jobs", which means that they don't
care to learn much about the process that they are implementing.

And the hardware guys don't have much insight or visibility into the
software. Often, not much control either, in a large organization
where things are very firewalled.

Recipe for disaster.


--

John Larkin Highland Technology, Inc trk

The cork popped merrily, and Lord Peter rose to his feet.
"Bunter", he said, "I give you a toast. The triumph of Instinct over Reason"
 
On Monday, January 13, 2020 at 11:35:30 AM UTC-5, Tom Gardner wrote:
[snip]

Agreed, but resigning is very different to deliberately
mis-implementing a spec.

In such circumstances I hope I would resign, but there have
been time in my life when that would have been impossible.

Somewhat less significant, I was doing a bus timing analysis of an interface between a new board and existing boards in a new radio which was not yet in full production. I found a small timing spec miss with a Flash memory part. I tried to report it to the lead engineer but the response I got was "the unit has passed acceptance testing", as if that meant there were no errors in the radio.
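
The arithmetic behind that sort of finding is simple enough to script.
A generic setup-margin check, with invented numbers standing in for
the datasheet and board values (not the actual radio's figures), looks
like:

# Generic setup-timing margin check. All numbers are invented stand-ins
# for datasheet and board values, not the actual radio's figures.

def setup_margin_ns(clock_period_ns, clk_to_out_max_ns, prop_delay_max_ns,
                    setup_required_ns, clock_skew_ns=0.0):
    """Worst-case setup slack at the Flash inputs; negative means a spec miss."""
    arrival = clk_to_out_max_ns + prop_delay_max_ns
    return clock_period_ns - arrival - setup_required_ns - clock_skew_ns

if __name__ == "__main__":
    margin = setup_margin_ns(clock_period_ns=25.0,   # 40 MHz bus
                             clk_to_out_max_ns=12.0,
                             prop_delay_max_ns=4.5,
                             setup_required_ns=9.0,
                             clock_skew_ns=0.5)
    print(f"setup margin: {margin:+.1f} ns")   # -1.0 ns -> fails by a hair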

I had been around and around with the company's hostile work environment and employees working around the system rather than doing what needed to be done. I let the matter drop. Not that it was particularly likely to cause a problem in the radio... but that was certainly a possibility, even if very, very small. These were military radios and everyone stressed how important it was that they work under all conditions. But the company had worn me down...

Companies suck. I won't be an employee again.

--

Rick C.

--+ Get 1,000 miles of free Supercharging
--+ Tesla referral code - https://ts.la/richard11209
 
On Monday, January 13, 2020 at 12:41:16 PM UTC-5, John Larkin wrote:
[snip]

Not in, but with. Most "just do our jobs", which means that they don't
care to learn much about the process that they are implementing.

And the hardware guys don't have much insight or visibility into the
software. Often, not much control either, in a large organization
where things are very firewalled.

Recipe for disaster.

Not really an issue of firewalls. Your company does the same thing as we have pointed out. The 'Brat' doesn't look over your shoulder when you design the circuits, she just routes the board the way you tell her. That's not a firewall. That's delegation of responsibility. Same in larger companies.

Unlike many small companies, large ones have changed significantly over the decades. While many small companies are run by one or a small number of autocrats, large companies set up formal processes to make important decisions. The design process virtually always includes peer review. They try hard to not make mistakes, even small ones.

But there are many opportunities to make mistakes and we don't always avoid every one of them. Sometimes we push the boundaries and find our iceberg.

--

Rick C.

-+- Get 1,000 miles of free Supercharging
-+- Tesla referral code - https://ts.la/richard11209
 
On 13/01/20 17:41, John Larkin wrote:
[snip]

Not in, but with. Most "just do our jobs", which means that they don't
care to learn much about the process that they are implementing.

Seen that, and it even occurs within software world:
-analysts lob spec over wall to developers
-developers lob code over wall to testers
-developers lob tested code over wall to operations
-rinse and repeat, slowly

"Devops" tries to avoid that inefficiency.


And the hardware guys don't have much insight or visibility into the
software. Often, not much control either, in a large organization
where things are very firewalled.

I've turned down job offers where the HR droids couldn't
deal with someone that successfully straddles both
hardware and software worlds.

> Recipe for disaster.

Yup, as we've seen.
 
On Saturday, January 11, 2020 at 1:10:56 AM UTC-5, Rick C wrote:
I think that is a load. Hardware often fouls up. The two space shuttle disasters were both hardware problems and both were preventable, but there was a clear lack of rigor in the design and execution. The Apollo 13 accident was hardware. The list goes on and on.

Then your very example of the Boeing plane is wrong because no one has said the cause of the accident was improperly coded software.

Technically, one of those shuttle disasters was due to management not listening to their engineers, including those at Morton-Thiokol, who warned that the booster rocket O-rings were unsafe to launch at cold temperature.

I don't consider that to be a "hardware problem" so much as an arrogantly stupid decision to launch under known, unsafe conditions.

As for the tiles (2nd shuttle loss), I am weirdly reminded of the Siegfried & Roy Vegas act with the white lions and tigers. They insured against every conceivable possibility (including the performance animals jumping into the crowd and causing a panic!). Everything that is, except the tiger viciously attacking Roy Horn on-stage.

You'd think you could see that coming..., or at least have a plan (however remote the possibility)?

With the shuttle heat tiles, NASA had to replace a lot of those after every flight. Did they never see the tiger?
 
