design rigor: electronics vs. software

On 1/11/20 9:47 AM, jlarkin@highlandsniptechnology.com wrote:
On Fri, 10 Jan 2020 21:46:19 -0800 (PST), omnilobe@gmail.com wrote:

Hardware designs are more rigorously done than
software designs. A large company had problems with a 737
and a rocket to the space station...

https://www.bloomberg.com/news/articles/2019-06-28/boeing-s-737-max-software-outsourced-to-9-an-hour-engineers

I know programmers who do not care for rigor at home or at work.
I did hardware design with rigor, featuring reviews by caring
electronics design engineers and marketing engineers.

Software gets sloppy with OOPs.
Object Oriented Programming.
Windows 10 on a rocket to the ISS.
C++ mud.

The easier it is to change things, the less careful people are about
doing them. Software, which includes FPGA code, seldom works the first
time. Almost never. The average hunk of fresh code has a mistake
roughly every 10 lines. Or was that three?

FPGAs are usually better than procedural code, but are still mostly
done as hack-and-fix cycles, with software test benches. When we did
OTP (fuse based) FPGAs without test benching, we often got them right
first try. If compiles took longer, people would be more careful.

PCBs usually work the first time, because they are checked and
reviewed, and that is because mistakes are slow and expensive to fix,
and very visible to everyone. Bridges and buildings are almost always
right the first time. They are even more expensive and slow and
visible.

Besides, electronics and structures have established theory, but
software doesn't. Various people just sort of do it.

My Spice sims are often wrong initially, precisely because there are
basically no consequences to running the first try without much
checking. That is of course dangerous; we don't want to base a
hardware design on a sim that runs and makes pretty graphs but is
fundamentally wrong.

Don't know why C++ is getting the rap here. Modern C++ design is
rigorous, there are books about what to do and what not to do, and the
language has built-in facilities to ensure that e.g. memory is never
leaked, pointers always refer to an object that exists, and the user
can't ever add feet to meters if they're not supposed to.

If the developer chooses to ignore it all like they always know better
than the people who wrote the books on it, well, God bless...

Embedded software is likely more reliable than ever, believe it or not.
The infotainment system in my Chevy has crashed once in three years,
99.999% reliable. There's probably a million lines of C++ behind the
scenes of that thing. Does Chevy employ the best coders in the world?
Probably not.
 
On Saturday, January 11, 2020 at 5:03:43 PM UTC-5, John Larkin wrote:
On Sat, 11 Jan 2020 18:08:32 +0000 (UTC),
DecadentLinuxUserNumeroUno@decadence.org wrote:

Rick C <gnuarm.deletethisbit@gmail.com> wrote in news:cb3bf0ea-0cdc-
42d7-b402-90ff2190683b@googlegroups.com:

Do your board layout people know how the products function?

What a stupid question.

I would not hire layout staff who were unable to understand the
board they were laying out. It is REQUIRED, especially if analog
signals are involved.

Board layout people? More like PCB design engineers. It's not just
about the schematic and what that engineer put into the circuit. Where
the parts are placed, and how their traces are routed, matters.

It ain't just point a to point b.

I don't think that any of my PCB layout people understood electronics.
They do learn about trace widths and impedances and manufacturing
issues, but I have to get them started on each layout, and I usually
place+route the tricky parts myself.

My three best layout people were women with no engineering background.
The Brat was/is the best, and she majored in softball and beer pong.

She did this one.

https://www.dropbox.com/s/w7ulg68pvni3hpf/Tem_Plus_PCB.JPG?raw=1

I thought the traces from the ADCs up into the FPGA were especially
elegant. I let her pick the BGA balls for best routing.

That board is duck soup to lay out. It would take very little skill to do. Mostly it is easy because there is so much space to work in that all you need to do is route the signals.

Can't really see detail around the BGA, but it looks like either no vias or via-in-pad. Either way, the traces are fat enough that you probably route them to the outer two rows only. That really makes life easy with BGA routing.

I try to avoid BGAs because of the fine design rules they impose on the board. Maybe that's no big deal with some fab houses, but in general going below 5/5 design rules starts to add cost to the board, and microvias run the price up too.

--

Rick C.

+- Get 1,000 miles of free Supercharging
+- Tesla referral code - https://ts.la/richard11209
 
On Saturday, January 11, 2020 at 5:31:33 PM UTC-5, bitrex wrote:
The infotainment system in my Chevy has crashed once in three years,
99.999% reliable. There's probably a million lines of C++ behind the
scenes of that thing. Does Chevy employ the best coders in the world?
Probably not.

Maybe the best in India... I'm just sayin'...

--

Rick C.

++ Get 1,000 miles of free Supercharging
++ Tesla referral code - https://ts.la/richard11209
 
On 2020/01/11 2:31 p.m., bitrex wrote:
On 1/11/20 9:47 AM, jlarkin@highlandsniptechnology.com wrote:
On Fri, 10 Jan 2020 21:46:19 -0800 (PST), omnilobe@gmail.com wrote:

Hardware designs are more rigorously done than
software designs. A large company had problems with a 737
and a rocket to the space station...
...

Embedded software is likely more reliable than ever, believe it or not.
The infotainment system in my Chevy has crashed once in three years,
99.999% reliable. There's probably a million lines of C++ behind the
scenes of that thing. Does Chevy employ the best coders in the world?
Probably not.

If your car crashed once every three years due to software glitches I
don't think you would be as impressed...

John :-#(#
 
Rick C wrote...
That board is duck soup to lay out.

I dunno, a 176-pin PLCC and a 256-pin BGA, plus
lots of other critical stuff, that's not so clear.

Anyway, I think John made his point.


--
Thanks,
- Win
 
On Saturday, January 11, 2020 at 8:50:25 PM UTC-5, Winfield Hill wrote:
Rick C wrote...

That board is duck soup to lay out.

I dunno, a 176-pin PLCC and a 256-pin BGA, plus
lots of other critical stuff, that's not so clear.

I don't follow your thinking. The size of the parts isn't important if there is lots of space to run the traces. The 176-pin QFP is trivial, really. Notice it only has connections to a couple of dozen pads.

This board was stuffed to the gills with parts on both sides and was a very, very challenging layout. The rev 1.1 board was in production and some upgrades were requested. The result barely fit on the board. At one point I was ready to give up, but then I found a way to better overlap pads on the two sides and free up just enough space to complete the routing.

http://arius.com/images/MS-DCARD-2.0_both.png

That was a hard layout.

If I have to redo the board it will require a BGA unless one of the new FPGA brands offers a part in an appropriate package. The BGA has many more pins but little advantage, since I don't need the large number of I/Os. In fact the extra pins would make routing harder, given the difficulty of fanning out a BGA. That's why having a lot of board space makes routing a snap.

> Anyway, I think John made his point.

And what was that other than showing his design?

--

Rick C.

--- Get 1,000 miles of free Supercharging
--- Tesla referral code - https://ts.la/richard11209
 
On 11 Jan 2020 17:50:07 -0800, Winfield Hill <winfieldhill@yahoo.com>
wrote:

Rick C wrote...

That board is duck soup to lay out.

I dunno, a 176-pin PLCC and a 256-pin BGA, plus
lots of other critical stuff, that's not so clear.

Anyway, I think John made his point.

There are four photodiode time stampers with 6 ps resolution, and
three delay generators with sub-ps resolution. There's a high-speed
SPI-like link to a control computer, and five more to energy
measurement boxes. Lots of controlled-impedance clocks and signals.

Rev A worked perfectly first try. No breadboards, no prototypes, no
cuts or jumpers. 6 layers.






--

John Larkin Highland Technology, Inc trk

jlarkin att highlandtechnology dott com
http://www.highlandtechnology.com
 
On 1/11/20 7:53 PM, John Robertson wrote:
On 2020/01/11 2:31 p.m., bitrex wrote:
On 1/11/20 9:47 AM, jlarkin@highlandsniptechnology.com wrote:
On Fri, 10 Jan 2020 21:46:19 -0800 (PST), omnilobe@gmail.com wrote:

Hardware designs are more rigorously done than
software designs. A large company had problems with a 737
and a rocket to the space station...
...

Embedded software is likely more reliable than ever, believe it or
not. The infotainment system in my Chevy has crashed once in three
years, 99.999% reliable. There's probably a million lines of C++
behind the scenes of that thing. Does Chevy employ the best coders in
the world? Probably not.




If your car crashed once every three years due to software glitches I
don't think you would be as impressed...

John :-#(#

The article is from June of last year. Zero evidence that "outsourced
coders" had anything to do with the 737 MAX's fatal problems.

Nope, despite numerous attempts to pin the blame on the backwards
foreigners, all the evidence points to the shitheads in question being
the best lily-white American-know-how engineers and managers money could
buy, employed at the top levels of Boeing.
 
On 1/11/20 7:00 PM, Rick C wrote:
On Saturday, January 11, 2020 at 5:31:33 PM UTC-5, bitrex wrote:
The infotainment system in my Chevy has crashed once in three years,
99.999% reliable. There's probably a million lines of C++ behind the
scenes of that thing. Does Chevy employ the best coders in the world?
Probably not.

Maybe the best in India... I'm just sayin'...

Linked article is from June of last year. Zero evidence that "outsourced
coders" had anything to do with the 737 MAX's fatal problems.

Nope, despite numerous attempts to pin the blame on the backwards
foreigners, all the evidence points to the shitheads in question being
the best lily-white American-know-how engineers and managers money could
buy, employed at the top levels of Boeing.
 
On 11/01/2020 14:57, jlarkin@highlandsniptechnology.com wrote:
On 11 Jan 2020 05:57:59 -0800, Winfield Hill <winfieldhill@yahoo.com
wrote:

Rick C wrote...

Then your very example of the Boeing plane is wrong
because no one has said the cause of the accident
was improperly coded software.

Yes, it was an improper spec, with dangerous reliance
on poor hardware.

If code kills people, it was improperly coded.

Not necessarily. The code written may well have exactly implemented the
algorithm(s) that the clowns supervised by monkeys specified. It isn't
the job of programmers to double check the workings of the people who do
the detailed calculations of aerodynamic force vectors and torques.

It is not the programmers' fault if the systems engineering, failure
analysis and aerodynamics calculations are incorrect in some way!

They knew that the whole design was a rat's nest, intended to make the
737 MAX flyable by people with a couple of hours' "training" on an iPad.
It was a triumph of marketing might over good engineering practice.

I have never been in the position of coding software that could actually
kill people, but I have been put in the position by aggressive salesmen
where meeting a customer's specification would require the repeal of one
or more laws of physics. The guys who sell stuff on a wing and a prayer
typically move on fast enough that, after pocketing their quadratic
over-target sales bonus, they are well out of it before the shit hits the fan.

Did Boeing's
programmers know nothing about how airplanes work? Just grunted out
lines of code?

They get a specification which, in the strictest terms possible, states
what the code must do in all cases. In aerospace you would expect every
possible path to be fully tested, including the seldom-travelled
worst-case error recovery ones. Boeing used to be fantastically good at this!

Snag is, if someone changes the maximum allowed limit from a fairly
reasonable 0.6 degrees to a larger 2.5 degrees, then all bets are off. The
code would have been fine with the original 0.6 degree adjustment limit
told to the FAA and other international flight safety organisations.

--
Regards,
Martin Brown
 
On 12/01/20 16:58, Martin Brown wrote:
I have never been in the position of coding software that would actually kill
people but I have been put in the position by aggressive salesmen where meeting
a customers specification would require the repeal of one or more laws of
physics.

I expect everybody here has seen that.

Useful phrases include "that's great; how did you solve the
Byzantine generals' problem?", and similar.


The guys who sell stuff on a wing and a prayer typically move on fast
enough that after pocketing their quadratic over target sales bonus they are
well out of it before the shit hits the fan.

Yup, seen that too, and not just w.r.t. software!

Trying to change the culture so they don't get their
bonus until after customer acceptance (or even
engineering sign off) is an exercise in futility.

Related point: all sales forecasts climb rapidly
after 2 years. No need to guess why.
 
Martin Brown <'''newspam'''@nezumi.demon.co.uk> wrote in
news:qvfj7v$fl6$1@gioia.aioe.org:

Not necessarily. The code written may well have exactly
implemented the algorithm(s) that the clowns supervised by monkeys
specified. It isn't the job of programmers to double check the
workings of the people who do the detailed calculations of
aerodynamic force vectors and torques.

It is not the programmers' fault if the systems engineering,
failure analysis and aerodynamics calculations are incorrect in
some way!

"the programmers" at those levels likely DO have to do some of the
calculations in the crafting of their code.

Shit C coders and "Aerodynamic Engineers with coding acumen" are
two different things.
 
On 2020-01-12 11:58, Martin Brown wrote:
On 11/01/2020 14:57, jlarkin@highlandsniptechnology.com wrote:
On 11 Jan 2020 05:57:59 -0800, Winfield Hill <winfieldhill@yahoo.com
wrote:

Rick C wrote...

Then your very example of the Boeing plane is wrong
because no one has said the cause of the accident
was improperly coded software.

Yes, it was an improper spec, with dangerous reliance
on poor hardware.

If code kills people, it was improperly coded.

Not necessarily. The code written may well have exactly implemented the
algorithm(s) that the clowns supervised by monkeys specified. It isn't
the job of programmers to double check the workings of the people who do
the detailed calculations of aerodynamic force vectors and torques.

It is not the programmers' fault if the systems engineering, failure
analysis and aerodynamics calculations are incorrect in some way!

That's a bit facile, I think. Folks who take an interest in their
professions aren't that easy to confine that way.

Back in my one foray into big-system design, we design engineers were
always getting in the systems guys' faces about various pieces of
stupidity in the specs. It was all pretty good-natured, and we wound up
with the pain and suffering distributed about equally.



Cheers

Phil Hobbs

--
Dr Philip C D Hobbs
Principal Consultant
ElectroOptical Innovations LLC / Hobbs ElectroOptics
Optics, Electro-optics, Photonics, Analog Electronics
Briarcliff Manor NY 10510

http://electrooptical.net
http://hobbs-eo.com
 
Phil Hobbs <pcdhSpamMeSenseless@electrooptical.net> wrote in
news:fb4888b5-e96f-1145-85e8-bc382c9bdcdf@electrooptical.net:

Back in my one foray into big-system design, we design engineers
were always getting in the systems guys' faces about various
pieces of stupidity in the specs. It was all pretty good-natured,
and we wound up with the pain and suffering distributed about
equally.

That is how men get work done... even 'the programmers'.
Very well said, there.

That is like the old dig on 'the hourly help'.

Some programmers are very smart. Others not so much.

I guess choosing to go into it is not such a smart move so they
take a hit from the start. :)
 
On Sunday, January 12, 2020 at 3:32:06 PM UTC-5, DecadentLinux...@decadence..org wrote:
Phil Hobbs <pcdhSpamMeSenseless@electrooptical.net> wrote in
news:fb4888b5-e96f-1145-85e8-bc382c9bdcdf@electrooptical.net:

Back in my one foray into big-system design, we design engineers
were always getting in the systems guys' faces about various
pieces of stupidity in the specs. It was all pretty good-natured,
and we wound up with the pain and suffering distributed about
equally.



That is how men get work done... even 'the programmers'.
Very well said, there.

That is like the old dig on 'the hourly help'.

Some programmers are very smart. Others not so much.

I guess choosing to go into it is not such a smart move so they
take a hit from the start. :)

If that is how men get work done, then they are not using the software and system engineering techniques developed in the last 15-20 years, and their results are *still* subject to the same types of errors. I do research and teach in this area. A number of studies, and one in particular, cite up to 70% of software faults as introduced on the LHS of the 'V' development model (other software design lifecycle models have similar fault percentages). A major issue is that most of these errors are observed at integration time (software+software, software+hardware). The cost of defect removal along the RHS of the 'V' development model is anywhere from 50-200X the removal cost along the LHS of the 'V'. (No wonder systems cost so much.)

The talk about errors in this thread is very high level, and most ppl have the mindset that they are thinking about errors at the unit test level. Numerous techniques have been developed to identify and fix fault types throughout the entire development lifecycle, but regrettably a lot of them are not employed. A large percentage of errors are indeed discovered and fixed at the unit test level, but errors of the type units mismatch, variable type mismatch, and a slew of concurrency issues aren't discovered till integration time. Usually, at that point, there is a 'rush' to get the system fielded. The horror stories and lessons learned are well documented.

IDK what exactly happened (yet) with the Boeing MAX development. I do have info from some sources that cannot be disclosed at this point. From what I've read, there were major mistakes made from inception through implementation and integration. My personal view is that one should almost never (never?) place the task on software to correct an inherently unstable airframe design; it is putting a bandaid on the source of the problem. Another major issue is that the hazard analysis and fault tolerance approach was not done at the system level (the redundancy approach was pitiful, as was the *logic* used in implementing it, and the concept itself).

I do think that the better software engineers have a more holistic view of the system (hardware knowledge + system operational knowledge), which allows them to ask questions when things don't 'seem right.' OTOH, software engineers should not go making assumptions about things and coding to those assumptions. (It happens more than you think.) It is the job of the software architect to ensure that any development assumptions are captured and specified in the software architecture.

In studies I have looked at, requirements errors account for somewhere between 30-40% of the overall number of faults during the design lifecycle, and the 'industry standard' approach to dealing with this problem is woefully inadequate despite techniques to detect and remove the errors. A LOT of time is spent doing software requirements tracing as opposed to verification of requirements. People argue that one cannot verify the requirements until the system has been built, which is complete BS, but industry is very slow to change. We have shown that software architecture modeling addresses a large percentage of system-level problems early in the design life cycle. We are trying to convince industry. Until change happens, the parade of failures like the MAX will continue.
 
On 2020-01-12 17:38, jjhudak4@gmail.com wrote:
On Sunday, January 12, 2020 at 3:32:06 PM UTC-5,
DecadentLinux...@decadence..org wrote:
Phil Hobbs <pcdhSpamMeSenseless@electrooptical.net> wrote in
news:fb4888b5-e96f-1145-85e8-bc382c9bdcdf@electrooptical.net:

Back in my one foray into big-system design, we design engineers
were always getting in the systems guys' faces about various
pieces of stupidity in the specs. It was all pretty
good-natured, and we wound up with the pain and suffering
distributed about equally.



That is how men get work done... even 'the programmers'. Very
well said, there.

That is like the old dig on 'the hourly help'.

Some programmers are very smart. Others not so much.

I guess choosing to go into it is not such a smart move so they
take a hit from the start. :)


If that is how men get work done then they are not using software
and system engineering techniques developed in the last 15-20 years
and their results are *still* subject to the same types of errors. I
do research and teach in this area. A number of studies, and one in
particular, cites up to 70% of software faults are introduced on
the LHS of the 'V' development model (Other software design lifecycle
models have similar fault percentages.) A major issue is that most
of these errors are observed at integration time
(software+software, software+hardware). The cost of defect removal
along the RHS of the 'V' development model is anywhere from 50-200X
of the removal cost along the LHS of the 'V'. (no wonder why systems
cost so much)

Nice rant. Could you tell us more about the 'V' model?

The talk about errors in this thread is very high level, and most
ppl have the mindset that they are thinking about errors at the unit
test level. There are numerous techniques developed to identify and
fix fault types throughout the entire development lifecycle but
regrettably a lot of them are not employed.

What sorts of techniques do you use to find problems in the specifications?
Actually a large percentage of the errors are discovered and fixed at
that level. Errors of the type: units mismatch, variable type
mismatch, and a slew of concurrency issues aren't discovered till
integration time. Usually, at that point, there is a 'rush' to get
the system fielded. The horror stories and lessons learned are well
documented.

Yup. Leaving too much stuff for the system integration step is a very,
very well-known way to fail.

IDK what exactly happened (yet) with the Boeing MAX development. I
do have info from some sources that cannot be disclosed at this
point. From what I've read, there were major mistakes made from
inception through implementation and integration. My personal view,
is that one should almost never (never?) place the task on software
to correct an inherently unstable airframe design - it is putting a
bandaid on the source of the problem.

It's commonly done, though, isn't it? I remember reading Ben Rich's
book on the Skunk Works, where he says that the F-117's very squirrelly
handling characteristics were fixed up in software to make it a
beautiful plane to fly. That was about 1980.

Another major issue is that the hazard analysis and fault tolerance
approach was not done at the system level (the redundancy approach was
pitiful, as was the *logic* used in implementing it, and the concept
itself).

I do think that the better software engineers do have a more
holistic view of the system (hardware knowledge + system operational
knowledge) which will allow them to ask questions when things don't
'seem right.' OTOH, the software engineers should not go making
assumptions about things and coding to those assumptions. (It
happens more than you think) It is the job of the software architect
to ensure that any development assumptions are captured and specified
in the software architecture.

In real life, though, it's super important to have two-way
communications during development, no? My large-system experience was
all hardware (the first civilian satellite DBS system, 1981-83), so
things were quite a bit simpler than in a large software-intensive
system. I'd expect the need for bottom-up communication to be greater
now rather than less.

In studies I have looked at, the percentage of requirements errors
is somewhere between 30-40% of the overall number of faults during
the design lifecycle, and the 'industry standard' approach
to dealing with this problem is woefully inadequate despite techniques
to detect and remove the errors. A LOT of time is spent doing
software requirements tracing as opposed to doing verification of
requirements. People argue that one cannot verify the requirements
until the system has been built - which is complete BS but industry
is very slow to change. We have shown that using software
architecture modeling addresses a large percentage of system level
problems early in the design life cycle. We are trying to convince
industry. Until change happens, the parade of failures like the
MAX will continue.

I'd love to hear more about that.

Cheers

Phil Hobbs


--
Dr Philip C D Hobbs
Principal Consultant
ElectroOptical Innovations LLC / Hobbs ElectroOptics
Optics, Electro-optics, Photonics, Analog Electronics
Briarcliff Manor NY 10510

http://electrooptical.net
http://hobbs-eo.com
 
On Sunday, January 12, 2020 at 12:42:54 AM UTC-5, John Larkin wrote:
On 11 Jan 2020 17:50:07 -0800, Winfield Hill <winfieldhill@yahoo.com
wrote:

Rick C wrote...

That board is duck soup to lay out.

I dunno, a 176-pin PLCC and a 256-pin BGA, plus
lots of other critical stuff, that's not so clear.

Anyway, I think John made his point.

There are four photodiode time stampers with 6 ps resolution, and
three delay generators with sub-ps resolution. There's a high-speed
SPI-like link to a control computer, and five more to energy
measurement boxes. Lots of controlled-impedance clocks and signals.
Wow! That sounds like quite a box. Four inputs? What's the dead time
on a channel? More or less than $10k?

George H.
Rev A worked perfectly first try. No breadboards, no prototypes, no
cuts or jumpers. 6 layers.






 
On Sunday, January 12, 2020 at 11:55:08 PM UTC+1, Phil Hobbs wrote:
On 2020-01-12 17:38, jjhudak4@gmail.com wrote:
On Sunday, January 12, 2020 at 3:32:06 PM UTC-5,
DecadentLinux...@decadence..org wrote:
Phil Hobbs <pcdhSpamMeSenseless@electrooptical.net> wrote in
news:fb4888b5-e96f-1145-85e8-bc382c9bdcdf@electrooptical.net:

Back in my one foray into big-system design, we design engineers
were always getting in the systems guys' faces about various
pieces of stupidity in the specs. It was all pretty
good-natured, and we wound up with the pain and suffering
distributed about equally.



That is how men get work done... even 'the programmers'. Very
well said, there.

That is like the old dig on 'the hourly help'.

Some programmers are very smart. Others not so much.

I guess choosing to go into it is not such a smart move so they
take a hit from the start. :)


If that is how men get work done then they are not using software
and system engineering techniques developed in the last 15-20 years
and their results are *still* subject to the same types of errors. I
do research and teach in this area. A number of studies, and one in
particular, cites up to 70% of software faults are introduced on
the LHS of the 'V' development model (Other software design lifecycle
models have similar fault percentages.) A major issue is that most
of these errors are observed at integration time
(software+software, software+hardware). The cost of defect removal
along the RHS of the 'V' development model is anywhere from 50-200X
of the removal cost along the LHS of the 'V'. (no wonder why systems
cost so much)

Nice rant. Could you tell us more about the 'V' model?

I guess he's referring to this one:

https://am7s.com/what-is-v-model7-model-systems-engineering/

We used to use it at work. Now we are transitioning to agile methods, since the V model is rigid and responds poorly to change during development. In particular, SW can benefit a lot from an agile mindset and from automated tests with high coverage.

Cheers

Klaus
 
On Sunday, January 12, 2020 at 5:39:03 PM UTC-5, jjhu...@gmail.com wrote:
If that is how men get work done then they are not using software and system engineering techniques developed in the last 15-20 years and their results are *still* subject to the same types of errors. I do research and teach in this area. A number of studies, and one in particular, cites up to 70% of software faults are introduced on the LHS of the 'V' development model (Other software design lifecycle models have similar fault percentages.) A major issue is that most of these errors are observed at integration time (software+software, software+hardware). The cost of defect removal along the RHS of the 'V' development model is anywhere from 50-200X of the removal cost along the LHS of the 'V'. (no wonder why systems cost so much)

That reminds me of a fact of designing FPGAs that surprised me when I realized it. We go to great lengths to assure the proper design of the code that goes into logic devices. But an equally important part is the timing of the logic paths. We write constraints to specify the timing requirements, which are then used to check the speed of the resulting logic in static timing analysis. However, we have no way to verify that the constraints specify what we intended. So any logic design can potentially fail due to improper timing constraints, which cannot themselves be tested or verified to be correct.

Go figure!

--

Rick C.

- Get 1,000 miles of free Supercharging
- Tesla referral code - https://ts.la/richard11209
 
On 13/1/20 9:55 am, Phil Hobbs wrote:
On 2020-01-12 17:38, jjhudak4@gmail.com wrote:
The cost of defect removal
along the RHS of the 'V' development model is anywhere from 50-200X
of the removal cost along the LHS of the 'V'. (no wonder why systems
cost so much)

Nice rant. Could you tell us more about the 'V' model?

The talk about errors in this thread are very high level and most
ppl have the mindset that they are thinking about errors at the unit
test level. There are numerous techniques developed to identify and
fix fault types throughout the entire development lifecycle but
regrettably a lot of them are not employed.

What sorts of techniques do you use to find problems in the specifications?

See below for pointers to John Hudak's and SEI's work in this area.

There are a number of other approaches that I don't see covered in
their work, too.

<https://lamport.azurewebsites.net/tla/tla.html> is one.
<http://factbasedmodeling.org/> is another.

All work on different aspects of verification, but basically they aim
to express (model) the problem in different ways so that it can be
inspected and tested against hypothesized situations to find anomalies.

FBM looks for static anomalies (a model that admits situations which
make no sense). TLA looks for behavioural anomalies (a sequence of
actions that could violate a system constraint). AADL looks for
performance/real-time anomalies.

It is the job of the software architect
to ensure that any development assumptions are captured and specified
in the software architecture.

In real life, though, it's super important to have two-way
communications during development, no? My large-system experience was
all hardware (the first civilian satellite DBS system, 1981-83), so
things were quite a bit simpler than in a large software-intensive
system. I'd expect the need for bottom-up communication to be greater
now rather than less.

The biggest difficulty with bottom-up communication is that the folk "at
the bottom" work with highly technical or formal artefacts and feel the
need to communicate in the same way; but the folk who need to understand
them simply don't, and, being frequently more senior, don't want to
admit their lack of understanding.

There is a deep gulf between requirements specification and
implementation. Folk in implementation use their formal methods training
to spot logical errors in the specifications, and assume the reason is
that the requirements folk simply don't know what they want. Sometimes
they're right, but more often the requirements folk simply don't have a
sufficiently precise language to express it.

The gulf can be crossed - but only by formal languages that can be
expressed in understandable ways.

Building tools to cross this language<->logic gulf using so-called
"fact-based modeling" has been the focus of my last 12 years of research.

In studies I have looked at, the percentage of requirements errors
is somewhere between 30-40% of the overall number of faults during
the design lifecycle, and the 'industry standard' approach
to dealing with this problem is woefully inadequate despite techniques
to detect and remove the errors. A LOT of time is spent doing
software requirements tracing as opposed to doing verification of
requirements.  People argue that one cannot verify the requirements
until the system has been built - which is complete BS but industry is
very slow to change. We have shown that using software architecture
modeling addresses a large percentage of system level problems early
in the design life cycle. We are trying to convince industry. Until
change happens, the parade of failures like the
MAX will continue.

I'd love to hear more about that.

The Software Engineering Institute at CMU (where John Hudak works) is
one of the most eminent bodies working in this space, though by no
means the only one; nor is their approach the only one that has made
significant inroads into this class of problem.

<https://resources.sei.cmu.edu/asset_files/TechnicalNote/2006_004_001_14678.pdf>
<http://www.openaadl.org/>

Clifford Heath
 
