highest frequency periodic interrupt?...

John Larkin

What's the fastest periodic IRQ that you have ever run?

We have one board with 12 isolated LPC1758 ARMs. Each gets interrupted
by its on-chip ADC at 100 kHz and does a bunch of filtering and runs a
PID loop, which outputs to the on-chip DAC. We cranked the CPU clock
down some to save power, so the ISR runs for about 7 usec max.
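
A quick budget check on those numbers, as a sketch: a 7 usec ISR at 100 kHz leaves about 30% of the CPU for everything else. The function names are made up for illustration, and the 133 MHz figure in the usage note is the Pico's nominal clock, not the LPC1758's (its reduced clock isn't stated).

```c
#include <stdint.h>

/* CPU cycles available between two interrupts: f_cpu / f_irq. */
static inline uint32_t cycles_per_period(uint32_t cpu_hz, uint32_t irq_hz)
{
    return cpu_hz / irq_hz;
}

/* CPU load in percent for an ISR that runs isr_ns nanoseconds,
 * triggered irq_hz times per second.
 * load = isr_ns / period_ns, with period_ns = 1e9 / irq_hz. */
static inline uint32_t isr_load_percent(uint32_t isr_ns, uint32_t irq_hz)
{
    uint64_t load = (uint64_t)isr_ns * irq_hz * 100u / 1000000000u;
    return (uint32_t)load;
}
```

isr_load_percent(7000, 100000) gives 70, and cycles_per_period(133000000, 100000) gives 1330 cycles per interrupt on a 133 MHz part.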

I ask because if I use a Pi Pico on some new projects, it has a
dual-core 133 MHz CPU, and one core may have enough compute power that
we wouldn't need an FPGA in a lot of cases. Might even do DDS in
software.
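
For reference, software DDS is usually a phase accumulator stepped once per sample interrupt. A minimal sketch, assuming a 32-bit accumulator and a lookup table indexed by the top phase bits; the names dds_t, dds_tuning_word and dds_next_index are hypothetical, and the waveform table itself is left out.

```c
#include <stdint.h>

/* Tuning word for a 32-bit phase accumulator:
 * round(f_out * 2^32 / f_sample). Valid for f_out < f_sample. */
static inline uint32_t dds_tuning_word(uint32_t f_out_hz, uint32_t f_sample_hz)
{
    return (uint32_t)((((uint64_t)f_out_hz << 32) + f_sample_hz / 2u)
                      / f_sample_hz);
}

typedef struct {
    uint32_t phase;  /* wraps naturally at 2^32, which is one full cycle */
    uint32_t step;   /* tuning word added on every sample */
} dds_t;

/* Advance one sample and return the top `bits` bits of the phase
 * (bits must be 1..32) for use as a waveform table index. */
static inline uint32_t dds_next_index(dds_t *d, unsigned bits)
{
    d->phase += d->step;
    return d->phase >> (32u - bits);
}
```

With a 100 kHz sample interrupt and a 25 kHz output, the tuning word is 2^30 and the phase wraps back to zero every four samples. The ISR cost is one add, one shift and one table read, plus the DAC write.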

RP2040 floating point is tempting but probably too slow for control
use. Things seem to take 50 or maybe 100 us. Back to scaled integers,
I guess.
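
To make "scaled integers" concrete, here is a sketch of one PI controller step in Q16.16 fixed point. The format, the pi_ctrl structure and the gains are assumptions for illustration, not the code running on the LPC1758 board. The right shift of a negative value is implementation-defined in C; GCC and Clang do an arithmetic shift.

```c
#include <stdint.h>

typedef int32_t q16_t;                 /* Q16.16: value = raw / 65536 */
#define Q16(x) ((q16_t)((x) * 65536))  /* for compile-time constants only */

/* Multiply two Q16.16 values using a 64-bit intermediate. */
static inline q16_t q16_mul(q16_t a, q16_t b)
{
    return (q16_t)(((int64_t)a * b) >> 16);
}

typedef struct {
    q16_t kp, ki;            /* proportional and integral gains */
    q16_t integ;             /* integrator state */
    q16_t out_min, out_max;  /* output limits, also used to clamp integ */
} pi_ctrl;

/* One controller update; call once per sample interrupt. */
static inline q16_t pi_step(pi_ctrl *c, q16_t setpoint, q16_t measured)
{
    q16_t err = setpoint - measured;

    c->integ += q16_mul(c->ki, err);
    if (c->integ > c->out_max) c->integ = c->out_max;  /* anti-windup */
    if (c->integ < c->out_min) c->integ = c->out_min;

    q16_t out = q16_mul(c->kp, err) + c->integ;
    if (out > c->out_max) out = c->out_max;
    if (out < c->out_min) out = c->out_min;
    return out;
}
```

Each step is two 32x32 multiplies, a few adds and four compares, so it should take far fewer cycles than a software floating point call.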

I was also thinking that we could make a 2- or 3-bit DAC with a few
resistors. The IRQ could load that at various places and a scope would
trace execution. That would look cool. On the 1758 thing we brought
out a single bit to a test point and raised that during the ISR so we
could see ISR execution time on a scope. My C guy didn't believe that
a useful ISR could run at 100 kHz and had no idea what execution time
might be.
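
The trace DAC idea can be sketched as a 3-bit field in the GPIO output word, written with a different stage number at each point in the ISR. The pin assignment (three adjacent pins starting at GPIO 2) and the function names are assumptions; the actual register write is left to the target's own GPIO API.

```c
#include <stdint.h>

#define TRACE_BASE_PIN 2u   /* hypothetical: first of three adjacent GPIOs */
#define TRACE_BITS     3u
#define TRACE_MASK     (((1u << TRACE_BITS) - 1u) << TRACE_BASE_PIN)

/* Return the new output-register value with the trace field set to
 * `stage`, leaving all other pins unchanged. */
static inline uint32_t trace_encode(uint32_t gpio_out, uint32_t stage)
{
    return (gpio_out & ~TRACE_MASK) | ((stage << TRACE_BASE_PIN) & TRACE_MASK);
}

/* Ideal unloaded output of a binary-weighted or R-2R ladder, in millivolts. */
static inline uint32_t trace_level_mv(uint32_t stage, uint32_t vdd_mv)
{
    return (stage & ((1u << TRACE_BITS) - 1u)) * vdd_mv / (1u << TRACE_BITS);
}
```

With a 3.3 V supply, stage 4 sits at 1650 mV, so eight ISR stages show up as eight distinct steps on the scope trace.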
 
On a sunny day (Fri, 13 Jan 2023 15:46:16 -0800) it happened John Larkin
<jlarkin@highlandSNIPMEtechnology.com> wrote in
<q5p3shh8f34tt34ka767750oc2ou8p7vl8@4ax.com>:

> What's the fastest periodic IRQ that you have ever run?
>
> We have one board with 12 isolated LPC1758 ARMs. Each gets interrupted
> by its on-chip ADC at 100 kHz and does a bunch of filtering and runs a
> PID loop, which outputs to the on-chip DAC. We cranked the CPU clock
> down some to save power, so the ISR runs for about 7 usec max.
>
> I ask because if I use a Pi Pico on some new projects, it has a
> dual-core 133 MHz CPU, and one core may have enough compute power that
> we wouldn't need an FPGA in a lot of cases. Might even do DDS in
> software.
>
> RP2040 floating point is tempting but probably too slow for control
> use. Things seem to take 50 or maybe 100 us. Back to scaled integers,
> I guess.
>
> I was also thinking that we could make a 2- or 3-bit DAC with a few
> resistors. The IRQ could load that at various places and a scope would
> trace execution. That would look cool. On the 1758 thing we brought
> out a single bit to a test point and raised that during the ISR so we
> could see ISR execution time on a scope. My C guy didn't believe that
> a useful ISR could run at 100 kHz and had no idea what execution time
> might be.

Well in that sort of thing you need to think in asm, instruction times,
but I have no experience with the RP2040, and little with ASM on ARM.
Should be simple to test how long the C code takes, do you have an RP2040?
Playing with one would be a good starting point.
Should I get one? Was thinking just for fun...
 
On Sat, 14 Jan 2023 04:47:22 GMT, Jan Panteltje
<pNaonStpealmtje@yahoo.com> wrote:

> Well in that sort of thing you need to think in asm, instruction times,
> but I have no experience with the RP2040, and little with ASM on ARM.
> Should be simple to test how long the C code takes, do you have an RP2040?

I got a few Pi Picos but haven't run them.

> Playing with one would be a good starting point.
> Should I get one? Was thinking just for fun...

I recently got a Pi4B "development system" from Amazon. Add a keyboard
and a monitor and a mouse and it will compile and debug programs for
the Pico. It runs their OS right out of the box.

https://www.amazon.com/dp/B0BB912MV1?th=1

The enclosure is a nightmare so I threw that away. Just run the board.
It doesn't seem to need the fan.

There's a book too:

https://www.amazon.com/dp/187196279X

I'll delegate the actual coding. I just wanted to see what the process
is like. It's impressive.
 
On a sunny day (Fri, 13 Jan 2023 21:08:08 -0800) it happened John Larkin
<jlarkin@highlandSNIPMEtechnology.com> wrote in
<qad4sh12oqlnmr5l6phv3n3mds658usq59@4ax.com>:

> I recently got a Pi4B "development system" from Amazon. Add a keyboard
> and a monitor and a mouse and it will compile and debug programs for
> the Pico. It runs their OS right out of the box.
>
> https://www.amazon.com/dp/B0BB912MV1?th=1

That 230 USD is a LOT of money!
Amazon is trying to profit from the Pi4 shortage, it seems.
It's not even an 8 GB model.

Paid about 100 USD for my Pi4 4 GB and my Pi4 8 GB just 2 years ago, December 2020,
including SD card, RapiOS, plastic housing, cables, cooling fins and supply.

No fan, it does run hot, about 70 C.
But I use that one for web browsing.
The older one with 4 GB memory has an eBay metal housing and a fan.
After lubricating that fan with Vaseline it now has run quiet for 4 years?
The metal housing also stops any WiFi, as that one is part of the security
system and no WiFi is allowed there.
It runs 24/7 recording 6 cameras, 2 audio channels, weather sensors (temp, air pressure, humidity),
air traffic, ship traffic, radiation etc. (from an even older Raspberry Pi that works as server).
http://panteltje.com/panteltje/xgpspc/index.html
Each Pi4 has a 4 TB Toshiba USB hard disk connected to it.

> The enclosure is a nightmare so I threw that away. Just run the board.
> It doesn't seem to need the fan.

Type this in a terminal to see the current temperature:
vcgencmd measure_temp

For more info on that command:
man vcgencmd

May be interesting:
vcgencmd measure_clock

Here now:
raspberrypi: ~ # vcgencmd measure_clock arm
frequency(48)=700207040

> There's a book too
>
> https://www.amazon.com/dp/187196279X
>
> I'll delegate the actual coding. I just wanted to see what the process
> is like. It's impressive.

I like the GPIO I/O, so many things you can do with that.
 
On Saturday, 14 January 2023 at 06:30:16 UTC, Jan Panteltje wrote:
> That 230 USD is a LOT of money!
> Amazon is trying to profit from the Pi4 shortage, it seems.
> It's not even an 8 GB model.
>
> Paid about 100 USD for my Pi4 4 GB and my Pi4 8 GB just 2 years ago, December 2020,
> including SD card, RapiOS, plastic housing, cables, cooling fins and supply.
>
> No fan, it does run hot, about 70 C.
> But I use that one for web browsing.
> The older one with 4 GB memory has an eBay metal housing and a fan.
> After lubricating that fan with Vaseline it now has run quiet for 4 years?
> The metal housing also stops any WiFi, as that one is part of the security
> system and no WiFi is allowed there.
> It runs 24/7 recording 6 cameras, 2 audio channels, weather sensors (temp, air pressure, humidity),
> air traffic, ship traffic, radiation etc. (from an even older Raspberry Pi that works as server).
> http://panteltje.com/panteltje/xgpspc/index.html
> Each Pi4 has a 4 TB Toshiba USB hard disk connected to it.
>
>> The enclosure is a nightmare so I threw that away. Just run the board.
>> It doesn't seem to need the fan.
>
> Type this in a terminal to see the current temperature:
> vcgencmd measure_temp
>
> For more info on that command:
> man vcgencmd
>
> May be interesting:
> vcgencmd measure_clock
>
> Here now:
> raspberrypi: ~ # vcgencmd measure_clock arm
> frequency(48)=700207040
>
>> There's a book too
>>
>> https://www.amazon.com/dp/187196279X
>>
>> I'll delegate the actual coding. I just wanted to see what the process
>> is like. It's impressive.
>
> I like the GPIO I/O, so many things you can do with that.

It isn't always necessary to use interrupts. Periodic I/O can be done
using DMA, and the programmable state machines can do a lot too.

The floating point code is stored in zero wait-state mask ROM,
so it isn't affected by cache misses as it would be if executing
from the flash memory.

For development it is worth looking at the Pi 400, which is a Pi 4
built into a keyboard. It is by default clocked faster than the standard
Pi 4 because there is a large internal heat sink for the CPU. Sometimes
the Pi 400 has had better availability. There are keyboard layouts
for various countries.
John
 
On 13/01/2023 23:46, John Larkin wrote:
> What's the fastest periodic IRQ that you have ever run?

Usually try to avoid having fast periodic IRQs in favour of offloading
them onto some dedicated hardware. But CPUs were slower then than now.

> We have one board with 12 isolated LPC1758 ARMs. Each gets interrupted
> by its on-chip ADC at 100 kHz and does a bunch of filtering and runs a
> PID loop, which outputs to the on-chip DAC. We cranked the CPU clock
> down some to save power, so the ISR runs for about 7 usec max.
>
> I ask because if I use a Pi Pico on some new projects, it has a
> dual-core 133 MHz CPU, and one core may have enough compute power that
> we wouldn't need an FPGA in a lot of cases. Might even do DDS in
> software.
>
> RP2040 floating point is tempting but probably too slow for control
> use. Things seem to take 50 or maybe 100 us. Back to scaled integers,
> I guess.

It might be worth benchmarking how fast the FPU really is on that device
(for representative sample code). The Intel i5 & i7 can do all except
divide in a single cycle these days - I don't know what Arm is like in
this respect. You get some +*- for free close to every divide too.

*BIG* time penalty for having two divides or branches too close
together. Worth playing around to find patterns the CPU does well.

Beware that what you measure gets controlled, but for polynomials up to 5
terms or rationals up to about 5,2 call overhead may dominate the
execution time (particularly if the stupid compiler puts a 16-byte
structure across a cache boundary on the stack).

Forcing inlining of small code sections can help. Do it to excess and it
will slow things down - there is a sweet spot. Loop unrolling is much
less useful these days now that branch prediction is so good.

> I was also thinking that we could make a 2- or 3-bit DAC with a few
> resistors. The IRQ could load that at various places and a scope would
> trace execution. That would look cool. On the 1758 thing we brought
> out a single bit to a test point and raised that during the ISR so we
> could see ISR execution time on a scope. My C guy didn't believe that
> a useful ISR could run at 100 kHz and had no idea what execution time
> might be.

ISR code is generally very short and best done in assembler if you want
it as quick as possible. Examining the code generation of GCC is
worthwhile since it sucks compared to Intel (better) and MS (best).

In my tests GCC is between 30% and 3x slower than Intel or MS for C/C++
when generating Intel CPU specific SIMD code with maximum optimisation.

MS compiler still does pretty stupid things like internal compiler
generated SIMD objects of 128, 256 or 512 bits (16, 32 or 64 bytes) and
having them crossing a cache line boundary.

--
Regards,
Martin Brown
 
On Sat, 14 Jan 2023 06:27:45 GMT, Jan Panteltje
<pNaonStpealmtje@yahoo.com> wrote:

> That 230 USD is a LOT of money!

It's nothing compared to setting up a big-box PC and installing an OS
and compilers and libraries for some chip that will be EOL soon. $230
is dinner for four around here at a middle-good restaurant.

> Amazon is trying to profit from the Pi4 shortage, it seems.
> It's not even an 8 GB model.

It comes with a power wart and cables and the OS on an SD card. It's
not worth shopping around for all that, if your time is worth
anything.


> Type this in a terminal to see the current temperature:
> vcgencmd measure_temp

Fingers are easier.


 
On a sunny day (Sat, 14 Jan 2023 08:31:33 -0800) it happened John Larkin
<jlarkin@highlandSNIPMEtechnology.com> wrote in
<tol5shtb7chchpkq63hnb1mfsveolk1tib@4ax.com>:

> Fingers are easier.

This is from Google:
For Raspberry Pi 3+, a 'soft' temperature limit of 60°C has been introduced.
This means that even before reaching the hard limit at 85°C, the clock speed is reduced from 1.4GHz to lower frequencies, reducing the temperatu
and
That is the so-called throttling. The Raspberry Pi monitors the temperature continuously.
Above 82 °C (180 °F), the clock frequency is automatically lowered, regardless of which flag is set. This action will reduce heat

So better use vcgencmd; it saves your finger from getting fried, too.
I should actually get a better housing with a fan for my Pi4 8 GB, like the one I have for my Pi4 4 GB, which runs at about 46 degrees C.
Of course, maybe bringing your own fried finger to a restaurant gets you a discount?

So if you need high speed and have a high processor load, then get decent cooling.

 
On Sat, 14 Jan 2023 15:52:49 +0000, Martin Brown
<\'\'\'newspam\'\'\'@nonad.co.uk> wrote:

> Usually try to avoid having fast periodic IRQs in favour of offloading
> them onto some dedicated hardware. But CPUs were slower then than now.

> It might be worth benchmarking how fast the FPU really is on that device
> (for representative sample code). The Intel i5 & i7 can do all except
> divide in a single cycle these days - I don't know what Arm is like in
> this respect. You get some +*- for free close to every divide too.

The RP2040 chip has FP routines in the ROM, apparently code with some
sort of hardware assist, but it's callable subroutines and not native
instructions to a hardware FP engine. When it returns it's done.

Various web sites seem to confuse microseconds and nanoseconds. 150 us
does seem slow for a "fast" FP operation. We'll have to do
experiments.

I wrote one math package for the 68K, with the format signed 32.32.
That behaved just like floating point in real life, but was small and
fast and avoided drecky scaled integers.
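
For anyone who hasn't met the format, a signed 32.32 number is a 64-bit integer read as value = raw / 2^32. The sketch below is a host-side illustration, not the 68K package: it leans on the GCC/Clang __int128 extension for the multiply, where a 32-bit target would have to build the product from four 32x32 partial products. The names are hypothetical.

```c
#include <stdint.h>

typedef int64_t fix32_32;   /* signed 32.32: value = raw / 2^32 */

/* Convert an integer to 32.32 (multiply instead of shifting, since
 * left-shifting a negative value is undefined in C). */
static inline fix32_32 fix_from_int(int32_t i)
{
    return (fix32_32)i * (1LL << 32);
}

/* Add and subtract are plain 64-bit integer operations. */
static inline fix32_32 fix_add(fix32_32 a, fix32_32 b)
{
    return a + b;
}

/* Multiply needs the middle 64 bits of the 128-bit product.
 * __int128 is a compiler extension; the shift of a negative value is
 * arithmetic on GCC and Clang. */
static inline fix32_32 fix_mul(fix32_32 a, fix32_32 b)
{
    return (fix32_32)(((__int128)a * b) >> 32);
}
```

The range is about +/-2.1e9 with a resolution of about 2.3e-10, which covers most analog control work without per-variable scaling.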

> *BIG* time penalty for having two divides or branches too close
> together. Worth playing around to find patterns the CPU does well.

Without true hardware FP, call locations probably don't matter.

> Beware that what you measure gets controlled, but for polynomials up to 5
> terms or rationals up to about 5,2 call overhead may dominate the
> execution time (particularly if the stupid compiler puts a 16-byte
> structure across a cache boundary on the stack).

We occasionally use polynomials, but 2nd order and rarely 3rd is
enough to get analog I/O close enough.
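
A low-order calibration polynomial is cheap if it is evaluated with Horner's rule and inlined, which also sidesteps the call overhead mentioned above. This is a sketch in Q16.16 fixed point; the format and the names q16_mul and poly_eval are assumptions for illustration.

```c
#include <stddef.h>
#include <stdint.h>

typedef int32_t q16_t;   /* Q16.16: value = raw / 65536 */

static inline q16_t q16_mul(q16_t a, q16_t b)
{
    return (q16_t)(((int64_t)a * b) >> 16);
}

/* Horner's rule: coeff[0] + x*(coeff[1] + x*(coeff[2] + ...)).
 * n is the number of coefficients, so a 2nd order fit has n = 3. */
static inline q16_t poly_eval(const q16_t *coeff, size_t n, q16_t x)
{
    q16_t acc = 0;
    while (n-- > 0)
        acc = q16_mul(acc, x) + coeff[n];
    return acc;
}
```

A 2nd order correction costs two multiplies and two adds per sample (plus one multiply by zero on the first pass, which a compiler can usually remove once the loop is unrolled for a constant n).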

> Forcing inlining of small code sections can help. Do it to excess and it
> will slow things down - there is a sweet spot. Loop unrolling is much
> less useful these days now that branch prediction is so good.

> ISR code is generally very short and best done in assembler if you want
> it as quick as possible. Examining the code generation of GCC is
> worthwhile since it sucks compared to Intel (better) and MS (best).
>
> In my tests GCC is between 30% and 3x slower than Intel or MS for C/C++
> when generating Intel CPU specific SIMD code with maximum optimisation.
>
> MS compiler still does pretty stupid things like internal compiler
> generated SIMD objects of 128, 256 or 512 bits (16, 32 or 64 bytes) and
> having them crossing a cache line boundary.

Nobody has answered my question. Generalizations about software timing
abound but hard numbers are rare. Programmers don't seem to use
oscilloscopes much.
 
On Sat, 14 Jan 2023 17:57:08 GMT, Jan Panteltje
<pNaonStpealmtje@yahoo.com> wrote:

> This is from Google:
> For Raspberry Pi 3+, a 'soft' temperature limit of 60°C has been introduced.
> This means that even before reaching the hard limit at 85°C, the clock speed is reduced from 1.4GHz to lower frequencies, reducing the temperatu
> and
> That is the so-called throttling. The Raspberry Pi monitors the temperature continuously.
> Above 82 °C (180 °F), the clock frequency is automatically lowered, regardless of which flag is set. This action will reduce heat
>
> So better use vcgencmd; it saves your finger from getting fried, too.

It might get hotter when it's compiling or something, but it's not
very warm. It would be easy to add the fan if it became necessary. The
kit did come with three stick-on heat sinks.

There are also LCD monitor things that the 4B mounts on the back of.
They have a fan.

> I should actually get a better housing with a fan for my Pi4 8 GB, like the one I have for my Pi4 4 GB, which runs at about 46 degrees C.
> Of course, maybe bringing your own fried finger to a restaurant gets you a discount?

My finger is calibrated. I can touch 50C forever and 60C for about
half a second. Touching 100C briefly hurts but does no harm. Baking a
real pie is more dangerous.

I've had interns that refused to touch chips to see if they are hot.
They were afraid of being electrocuted by 3.3 volts.
 
On 14.01.23 at 19:21, John Larkin wrote:

> Nobody has answered my question. Generalizations about software timing
> abound but hard numbers are rare. Programmers don't seem to use
> oscilloscopes much.

I did it on the BeagleBoneBlack. It has an ARM CPU to run
Debian Linux etc and two I/O processors that are 200 MHz RISCs
without pipeline stalls and operating system. The TI C compiler
is on the BBB. I can do I/O with 5ns resolution & rate.
No jitter, and directly from a C program.

volatile int i;
myportbit = 0;
.....
myportbit = 1;
i = 0;
i = 0;
myportbit = 0;

would create a 15 ns wide pulse on myportbit.


Cheers, Gerhard
 
https://forums.raspberrypi.com/viewtopic.php?t=308794#p1848188
 
On 1/14/2023 8:52 AM, Martin Brown wrote:
> ISR code is generally very short and best done in assembler if you want it as
> quick as possible. Examining the code generation of GCC is worthwhile since it
> sucks compared to Intel (better) and MS (best).

I always code ISRs in a HLL -- if only to act as pseudo-code
illustrating what the (ASM) code is actually doing. IME, people
miss details in ASM so having those expressed in a HLL makes
it easier for them to understand the *goal* of the code.

Looking at a .S is a great starting point *if* you have to
hand-tweak the code. Remember that the code that gets
executed will change as the compiler is revised; ASM won't
(which can be A Good Thing as well as A Bad Thing).

> In my tests GCC is between 30% and 3x slower than Intel or MS for C/C++ when
> generating Intel CPU specific SIMD code with maximum optimisation.

I'd be less worried about quality of code generator (compiler vs. human ASM)
than the effects of cache, core affinity, *which* bus(es) are called on
for each instruction, other contenders for those resources, etc.

I wrote a MT driver for ISA (1600bpi @ 100ips). Doesn't allow much time
to actually talk to the *I/O* with the throughput available on that bus!

Better approaches (barring committing hardware to a task -- boo, hiss!)
are to decouple the time constraint from the code's execution. E.g.,
let loops run "as fast as they can" and adjust the code to compensate,
dynamically (if there is any VARIABLE latency in an ISR, then you
likely have overlooked this, already!)

> MS compiler still does pretty stupid things like internal compiler generated
> SIMD objects of 128, 256 or 512 bits (16, 32 or 64 bytes) and having them
> crossing a cache line boundary.

Advantage: ASM.

But, only if the programmer actually understands the hardware at a level above
the programming model. (programmers are often pretty lousy at understanding
the implications of a hardware design; engineers/coders equally so when trying
to map their knowledge onto an algorithm: "Why doesn't it ALWAYS work?")
 
On 1/14/2023 11:55 AM, Gerhard Hoffmann wrote:
> I did it on the BeagleBoneBlack. It has an ARM CPU to run
> Debian Linux etc and two I/O processors that are 200 MHz RISCs
> without pipeline stalls and operating system. The TI C compiler
> is on the BBB. I can do I/O with 5ns resolution & rate.
> No jitter, and directly from a C program.
>
> volatile int i;
> myportbit = 0;
> ....
> myportbit = 1;
> i = 0;
> i = 0;
> myportbit = 0;
>
> would create a 15 ns wide pulse on myportbit.

At the very least, you want to annotate this to indicate the
"i" assignments are being used SOLELY for their side-effects
(else someone may erroneously remove one -- OR BOTH -- of
them in a misguided attempt to "improve" the code; or, change
the type of i, etc.).

Likewise, making the ordering of the myportbit assignments
more explicit -- to ensure they aren't reordered or
removed (by an overzealous maintainer).

Of course, if no one ever sees your code but you (and, you
have a perfect memory), ...
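
One way to annotate it, as a sketch only: here myportbit is an ordinary volatile variable standing in for the real port register, and timing_pad and pulse_port_bit are made-up names. The 15 ns figure depends on the 200 MHz I/O processor and its compiler; nothing about this snippet guarantees it elsewhere.

```c
#include <stdint.h>

/* Stand-in for the memory-mapped port bit (assumption for illustration;
 * on real hardware this is a device register, not a plain variable). */
static volatile uint32_t myportbit;

/* Written ONLY for its timing side effect. It must stay volatile so the
 * compiler cannot delete or merge the stores. Do not "clean up". */
static volatile int timing_pad;

static inline void pulse_port_bit(void)
{
    myportbit = 1;    /* rising edge */
    timing_pad = 0;   /* pad: one store time, intentional */
    timing_pad = 0;   /* pad: one store time, intentional */
    myportbit = 0;    /* falling edge */
}
```

Because every access here is volatile, the compiler has to emit all four stores in program order; wrapping them in a named function makes the intent harder to miss than loose assignments in the middle of other code.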
 
On Sat, 14 Jan 2023 12:20:07 -0700, Don Y
<blockedofcourse@foo.invalid> wrote:

On 1/14/2023 8:52 AM, Martin Brown wrote:
ISR code is generally very short and best done in assembler if you want it as
quick as possible. Examining the code generation of GCC is worthwhile since it
sucks compared to Intel(better) and MS (best).

I always code ISRs in a HLL -- if only to act as pseudo-code
illustrating what the (ASM) code is actually doing. IME, people
miss details in ASM so having those expressed in a HLL makes
it easier for them to understand the *goal* of the code.

Looking at a .S is a great starting point *if* you have to
hand-tweak the code. Remembering that the code that gets
executed will change as the compiler is revised; ASM won't
(which can be A Good Thing as well as A Bad Thing).

In my tests GCC is between 30% and 3x slower than Intel or MS for C/C++ when
generating Intel CPU specific SIMD code with maximum optimisation.

I'd be less worried about quality of code generator (compiler vs. human ASM)
than the effects of cache, core affinity, *which* bus(es) are called on
for each instruction, other contenders for those resources, etc.

The Pi Pico executes code out of the 2 Mbyte SPI flash, with a 16
Kbyte cache. Cache misses will be *very* slow. So code will need to be
very tight bare-metal. The entire ISR should fit in cache.

When that gets dicey, we'll have to add an FPGA.



I wrote a MT driver for ISA (1600bpi @ 100ips). Doesn't allow much time
to actually talk to the *I/O* with the throughput available on that bus!

Better approaches (barring committing hardware to a task -- boo, hiss!)
are to decouple the time constraint from the code's execution. E.g.,
let loops run "as fast as they can" and adjust the code to compensate,
dynamically (if there is any VARIABLE latency in an ISR, then you
likely have overlooked this, already!)

Control loops need to run at a constant rate, with a modest amount of
jitter maybe.
 
On Sat, 14 Jan 2023 11:07:11 -0800 (PST), Lasse Langwadt Christensen
<langwadt@fonz.dk> wrote:

>https://forums.raspberrypi.com/viewtopic.php?t=308794#p1848188

If I understand that, a floating add would take about 500 ns with a
133 MHz clock. That's not as bad as software float, but I wouldn't be
able to do much fp math in a 100 KHz irq.

So, scaled integers or FPGA.
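
Rough budget behind that conclusion: a 100 kHz interrupt leaves 10 us per sample, about 1330 cycles at 133 MHz, so at roughly 500 ns per floating add, 20 adds would use up the whole period. As a sketch of the scaled-integer alternative, here is a first-order low-pass filter in Q15 that keeps every product within 32 bits. The type and function names are hypothetical, and a 12-bit non-negative input is assumed.

```c
#include <stdint.h>

#define FRAC_BITS 15  /* Q15: 32768 represents 1.0 */

typedef struct {
    int32_t acc;        /* filter output scaled by 2^FRAC_BITS */
    int32_t alpha_q15;  /* smoothing factor, 0..32768 for 0.0..1.0 */
} lowpass_q15;

/* y[n] = y[n-1] + alpha * (x[n] - y[n-1]), integer arithmetic only.
   With 0 <= x <= 4095 the accumulator stays non-negative and below 2^27,
   so the right shifts are well defined and nothing overflows 32 bits. */
static inline int32_t lowpass_step(lowpass_q15 *f, int32_t x)
{
    int32_t y = f->acc >> FRAC_BITS;
    f->acc += f->alpha_q15 * (x - y);
    return f->acc >> FRAC_BITS;
}
```

With `alpha_q15 = 16384` (0.5) and a constant input of 1000, the output steps from 0 to 500 and then to 750, using one multiply, two shifts and a few adds per sample.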
 
On Saturday, 14 January 2023 at 21:09:04 UTC+1, John Larkin wrote:
The Pi Pico executes code out of the 2 Mbyte SPI flash, with a 16
Kbyte cache. Cache misses will be *very* slow. So code will need to be
very tight bare-metal. The entire ISR should fit in cache.

you can copy some (or all) of the code to ram instead of using execute-in-place from flash

I think you can even turn off the cache to get an additional 16k ram
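
A minimal sketch of the copy-to-RAM idea, assuming the Pico SDK's `__not_in_flash_func` macro from `pico/platform.h` is available on the target. The wrapper macro `ISR_IN_RAM`, the function name and the variables are hypothetical, and the fallback branch exists only so the sketch also compiles off-target.

```c
#include <stdint.h>

#if defined(__has_include)
#  if __has_include("pico/platform.h")
#    include "pico/platform.h"   /* Pico SDK: expected to define __not_in_flash_func */
#  endif
#endif

#ifdef __not_in_flash_func
#  define ISR_IN_RAM(name) __not_in_flash_func(name)  /* place function in RAM */
#else
#  define ISR_IN_RAM(name) name  /* off-target fallback: no special placement */
#endif

static volatile uint32_t isr_count;    /* number of samples handled */
static volatile uint16_t last_sample;  /* most recent sample */

/* Body of a hypothetical sample handler. On the Pico it would run from SRAM,
   so a cache miss on the flash cannot stall it. */
void ISR_IN_RAM(sample_isr_body)(uint16_t sample)
{
    last_sample = sample;
    isr_count = isr_count + 1;
}
```

Anything the handler calls would need the same treatment, otherwise the call goes back to flash and the latency problem returns.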
 
On Saturday, 14 January 2023 at 21:15:02 UTC+1, John Larkin wrote:
On Sat, 14 Jan 2023 11:07:11 -0800 (PST), Lasse Langwadt Christensen
lang...@fonz.dk> wrote:

https://forums.raspberrypi.com/viewtopic.php?t=308794#p1848188

If I understand that, a floating add would take about 500 ns with a
133 MHz clock. That's not as bad as software float, but I wouldn't be
able to do much fp math in a 100 KHz irq.

So, scaled integers or FPGA.

or a different MCU with an FPU; "blackpills" are a similar form factor and have a Cortex-M4
https://www.amazon.com/Alinan-STM32F401CCU6-Development-MicroPython-Programming/dp/B0B96YMQQP/
 
On Sat, 14 Jan 2023 12:20:24 -0800 (PST), Lasse Langwadt Christensen
<langwadt@fonz.dk> wrote:

On Saturday, 14 January 2023 at 21:09:04 UTC+1, John Larkin wrote:
The Pi Pico executes code out of the 2 Mbyte SPI flash, with a 16
Kbyte cache. Cache misses will be *very* slow. So code will need to be
very tight bare-metal. The entire ISR should fit in cache.

you can copy some (or all) of the code to ram instead of using execute-in-place from flash

That's a good idea. A typical ISR could be pretty small, and let the
mainline program thrash all it likes.

I think you can even turn off the cache to get an additional 16k ram

Yikes, execute out of SPI flash?
 
On Sat, 14 Jan 2023 12:27:03 -0800 (PST), Lasse Langwadt Christensen
<langwadt@fonz.dk> wrote:

On Saturday, 14 January 2023 at 21:15:02 UTC+1, John Larkin wrote:
On Sat, 14 Jan 2023 11:07:11 -0800 (PST), Lasse Langwadt Christensen
lang...@fonz.dk> wrote:

https://forums.raspberrypi.com/viewtopic.php?t=308794#p1848188

If I understand that, a floating add would take about 500 ns with a
133 MHz clock. That's not as bad as software float, but I wouldn't be
able to do much fp math in a 100 KHz irq.

So, scaled integers or FPGA.

or a different MCU with an FPU; "blackpills" are a similar form factor and have a Cortex-M4
https://www.amazon.com/Alinan-STM32F401CCU6-Development-MicroPython-Programming/dp/B0B96YMQQP/

We use STM32F207IGT6 on some existing products, but they are hard to
get hence expensive. The Pi Pico for $4 is very appealing.
 