Unexpected realtime delay on task 0 on a HP8300

10 Mar 2024 08:56 #295562 by rodw
Swap your NICs around so you use the Intel NIC for the Mesa connection. Then you should be good.

10 Mar 2024 22:54 #295603 by lrak
@rodw
I don't think that is the problem. It could be that it is fixed: I've had another day without a problem. I will let it run for a week or so.

It could be that I fixed it by uncommenting the hardware-irq-coalesce-rx-usecs 0 line. (I can't remember why I commented it out before.) I've also turned turbo off.

,.,.
OT:
Not sure how turbo works. It is described by Intel as:
"Intel® Turbo Boost Technology 2.0 accelerates processor and graphics performance for peak loads, automatically allowing processor cores to run faster than the rated operating frequency if they’re operating below power, current, and temperature specification limits. Whether the processor enters into Intel® Turbo Boost Technology 2.0 and the amount of time the processor spends in that state depends on the workload and operating environment."

What it doesn't say is whether it uses the ME processor (a separate processor in the same chip) to check current and temperature in a way that preempts. I've had AMT off in the BIOS (ME = Management Engine), but it seems possible that turbo uses the Management Engine processor anyway, which would be bad, since Management Engine work preempts other tasks.

What is clear is that not only LinuxCNC but also video/audio production, VoIP, etc. need uninterrupted processor support. There was a time when Intel produced industrial motherboards with this in mind. What changed is that 'secure-boot' (which is the OPPOSITE of what it sounds like) seems to have been mandated by three-letter folks without public knowledge or consent.

It would be nice to find a platform that supports coreboot on modern hardware - clean of binary blobs.
coreboot.org/status/board-status.html

Beyond coreboot - it is important to understand that modern systems just don't execute instructions in a knowable number of clocks - even with simple ARM core chips, you can't trust that a flash read will take the same time twice. The same goes for setting up PCIe transfers. This shouldn't matter in a system (like Mesa) that uses hardware counters - but it breaks software loop timing of I/O.

So the actual processing power needed to run LinuxCNC is quite modest - it is the preemption latency caused by the ME and binary blobs that causes problems.

10 Mar 2024 23:38 - 10 Mar 2024 23:39 #295606 by PCW
Setting IRQ coalescing only applies to the motherboard Intel MAC,
which I assumed you were using. If you are using a Realtek MAC,
it makes sense to comment out the IRQ line in the interfaces file,
since the Realtek driver does not support that feature.
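
Which of the two cases applies can be read off the interface itself. The following is only a minimal sketch and was not posted in the thread: `nic_driver` is a made-up helper name, and `enp2s0` is assumed to be the Mesa-facing interface (it is the name that shows up in later posts). `ethtool -i` and `ethtool -c` are standard ethtool options.

```shell
#!/bin/sh
# Hypothetical helper: print the kernel driver name found in
# `ethtool -i <iface>` output supplied on stdin.
nic_driver() {
    awk '$1 == "driver:" { print $2 }'
}

# Example usage (interface name is an assumption):
#   ethtool -i enp2s0 | nic_driver   # e.g. e1000e, igb, r8169, r8168
#   ethtool -c enp2s0                # coalescing settings as the driver reports them
```

If the driver name turns out to be a Realtek one, the coalescing line in the interfaces file is the one PCW suggests leaving commented out.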

The problem with Turbo modes, or any mode that dynamically
changes the CPU clock speed, is that the processor is halted while
the PLL-generated clock frequency stabilizes, and this can take
multiple milliseconds.
Last edit: 10 Mar 2024 23:39 by PCW.

11 Mar 2024 00:07 #295607 by tommylight
I think it is about time to open the PC and yank out the CPU cooler, clean, repaste, mount, test.

11 Mar 2024 00:51 #295610 by rodw
Debian Bookworm has excellent support for Secure Boot.
However, the "secure" side kicks in if you need to install kernel modules, as this is seen as an unauthorised change to the kernel code. This applies if you need to install the Realtek r8168-dkms driver or EtherCAT. There is a process to enrol DKMS with the Secure Boot system so it all works nicely.

I just installed Debian 12.5 on an office machine this morning that needs Secure Boot enabled. The installer worked fine.

12 Mar 2024 03:31 #295711 by lrak
@PCW & rodw
Error popped up again last night. <grump>

OK - I think this might be something to do with the Realtek driver?
After purging r8168-dkms and reinstalling the kernel (there is a bug in r8168, so you need to do this),
I get a different driver that I'm testing with now.

modinfo r8169 |grep verm
vermagic:       6.1.0-18-rt-amd64 SMP preempt_rt mod_unload modversions


So the current version in Debian is not the same:
# modinfo r8168 |grep verm
vermagic:       6.1.0-18-rt-amd64 SMP preempt_rt mod_unload modversions

.. Checking - updated pci.ids

I think this was changed - the old version might have had this misidentified?

 8161  RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
               10ec 8168  TP-Link TG-3468 v4.0 Gigabit PCI Express Network Adapter

But - the kernel is using r8169?  Not sure why - seems to work - ran some gcode.  

There was a reason that I used the Intel for the LAN - don't remember why now.  If this is still acting strange I will dig up an Intel daughter board.

,.,.,.,.,.,.,.
@rodw
- Re 'secure boot': long story, but 'secure boot' as used by Debian appears OK. The ME + secure-boot combination, however, is the opposite of secure - the name is quite misleading. Some hints here:

wiki.gentoo.org/wiki/User:Sakaki/Sakaki%...el_Management_Engine

12 Mar 2024 03:36 #295714 by cornholio

12 Mar 2024 07:09 #295720 by rodw
Typically, the Linux kernel ships with the r8169 driver. But this driver is suboptimal and has led to Ethernet realtime delays.
Installing the r8168-dkms driver is one part of the solution. But the remaining issue is that the Debian realtime kernel is slow compared with a kernel compiled from pristine kernel sources.

The simplest solution, as suggested before, is to use your Intel NIC for communicating with the Mesa card. Test this with the correct coalescing settings as per man hm2_eth. Then look at the Energy-Efficient Ethernet (EEE) settings, followed by the CPU affinity settings discussed in some other threads on this forum.

13 Mar 2024 02:14 #295800 by lrak
@rodw  & @cornholio

24 hours without an event with the kernel driver:
# ethtool --show-eee enp2s0
EEE settings for enp2s0:
       EEE status: enabled - inactive
       Tx LPI: disabled
       Supported EEE link modes:  100baseT/Full
                                  1000baseT/Full
       Advertised EEE link modes:  100baseT/Full
                                   1000baseT/Full
       Link partner advertised EEE link modes:  Not reported

So I reinstalled the r8168-dkms module and put
options r8168 eee_enable=0
options r8168 aspm=0
in modprobe.d/r8168-dkms.conf


Only thing left - I see wake-on turned on??
Supports Wake-on: pumbg
 Wake-on: g
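
For reference, `g` in ethtool's output means wake on magic packet, and `d` means Wake-on-LAN is disabled. Whether Wake-on-LAN has anything to do with the realtime delay is not established in this thread. A minimal sketch for reading the active mode and switching it off follows: `wol_mode` is a made-up helper name and `enp2s0` is assumed to be the interface in question, while `ethtool -s <iface> wol d` is the standard ethtool syntax.

```shell
#!/bin/sh
# Hypothetical helper: print the active Wake-on-LAN mode from
# `ethtool <iface>` output supplied on stdin. Matching on the first
# field skips the "Supports Wake-on:" line.
wol_mode() {
    awk '$1 == "Wake-on:" { print $2 }'
}

# Example usage (interface name is an assumption):
#   ethtool enp2s0 | wol_mode        # prints the active mode letters
#   sudo ethtool -s enp2s0 wol d     # disable Wake-on-LAN for this session
```

A setting made with `ethtool -s` does not survive a reboot, so it is easy to try and easy to undo.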

28 Mar 2024 04:24 - 28 Mar 2024 04:25 #296967 by lrak
Well, the problem is still here - I swapped the Realtek NIC for an Intel one:

Ethernet controller: Intel Corporation I210 Gigabit Network Connection (rev 03)

It uses the igb module.

I don't think it is the network card. How can I get a list of the running tasks, and/or the time, when the error fires?
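
One possible approach, offered only as a rough sketch and not as an established LinuxCNC procedure: watch a stream of log lines and, when the delay message appears, record a timestamp and a process snapshot. `snapshot_on_delay` and the default log file name are made up for illustration, and it is an assumption that the message reaches the terminal output when LinuxCNC is started from a shell. The snapshot is taken after the message is seen, so it may miss whatever caused the delay.

```shell
#!/bin/sh
# Hypothetical helper: read log lines on stdin; whenever one mentions the
# realtime delay, append a timestamp, the line itself, and a process
# snapshot to the file named by the first argument.
snapshot_on_delay() {
    out="${1:-rt-delay-snapshots.log}"
    while IFS= read -r line; do
        case "$line" in
            *"Unexpected realtime delay"*)
                {
                    printf '=== %s ===\n' "$(date '+%Y-%m-%d %H:%M:%S')"
                    printf '%s\n' "$line"
                    # pid, CPU the task last ran on, scheduling class,
                    # realtime priority, CPU usage, command name
                    ps -eo pid,psr,cls,rtprio,pcpu,comm --sort=-pcpu | head -n 20
                } >> "$out"
                ;;
        esac
    done
}

# Example usage (assumes the message appears on LinuxCNC's terminal output):
#   linuxcnc 2>&1 | snapshot_on_delay /tmp/rt-delay.log
```

The timestamps in the log can then be compared against cron jobs, log rotation, or other periodic activity on the machine.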

Last edit: 28 Mar 2024 04:25 by lrak.
