Unexpected realtime delay on task 0 with period 1000000

27 Oct 2024 02:06 #313279 by rodw

@PCW who would be the person or group best suited to fixing it, and do they know about it? Is this a LinuxCNC thing or a Debian thing?
 
Honestly it's a Linux thing (broken driver), but it would be easier to fix in a non-Debian-compliant ISO.
Ultimately it should get fixed in the kernel driver so the proprietary Realtek driver is not needed.

Interesting, would it help the Intel adapter performance as well? Bring it back to the performance of the 4.9 kernel days? I miss those days... lol


I don't think this is a kernel bug; it is related to decisions made about which Realtek NIC drivers are included in the kernel. The R8169 driver is a generic NIC driver for the Realtek cards, but it fails in a real-time environment on the R8168/R8111 hardware (and the same goes for the Realtek R8125). I suspect it is related to licensing restrictions on combining Realtek binaries with FOSS code.
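
If you are not sure which of these drivers your NIC is actually bound to, sysfs will tell you. A minimal sketch, assuming the standard `/sys/class/net/<device>/device/driver` link; the function name `nic_driver` and the device name `eth0` are placeholders for illustration only:

```shell
# Print the kernel driver bound to a network device by reading its sysfs link.
# $1 = device name
# $2 = sysfs root (defaults to /sys; only worth overriding for testing)
nic_driver() {
    basename "$(readlink "${2:-/sys}/class/net/$1/device/driver")"
}

# Usage, assuming the device is called eth0:
#   nic_driver eth0      # prints the driver name, e.g. r8169 or r8168
```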

Before kernel 5.10 (Bullseye), you had to download the driver sources from Realtek and build your own driver, and the result was kernel specific. Users of the Odroid H2+ will remember having to compile the R8125 driver. It was hard to find a working version for your kernel. Somewhere around kernel 5.9, I think the Realtek code was separated into different drivers and the R8169 was included in the kernel, while the non-FOSS code from Realtek was split out into DKMS driver modules. DKMS allows kernel modules to be distributed as source code and built automatically for a specific kernel. The module is recompiled if you upgrade your kernel.

So here is the problem. The R8168_dkms driver is not part of the kernel, so it's not considered a kernel bug. The RT project does not see it as part of their remit because a driver is available.

I have tried reporting bugs with Debian and the RT kernel team, but it is exhausting and unproductive, so I gave up.

It gets a bit worse: because the R8168_dkms driver is not a kernel module, it is not included in the Debian non-free-firmware component, so your sources.list still specifically requires the non-free component. (We add this for you in the LinuxCNC ISO.)

27 Oct 2024 02:49 - 27 Oct 2024 02:52 #313283 by PCW
I would call it a Preempt-rt kernel bug since the r8169 driver is a FOSS kernel driver
that fails in RT environments. Unfortunately it looks like very little if any RT work has been
done on network drivers (looking at the RT patches)

It looks like the main driver related RT patches are for video drivers...
Last edit: 27 Oct 2024 02:52 by PCW.

27 Oct 2024 12:14 #313293 by Lcvette
Seems like it could be fixed then, if it's a known issue. It sure would save some folks the headache. I wonder whether the issue is actually known, given there has been little to no work to resolve it. Is there anyone in the LinuxCNC community that contributes to kernel work in this area?

27 Oct 2024 19:32 #313327 by rodw

I would call it a Preempt-rt kernel bug since the r8169 driver is a FOSS kernel driver
that fails in RT environments. Unfortunately it looks like very little if any RT work has been
done on network drivers (looking at the RT patches)
 

Agreed about the FOSS driver, which was what I said in a roundabout way.
There are no realtime network drivers because network communication is not considered to be real time, so it's not really an RT issue. That's why CPU affinity becomes important, as it keeps the non-RT NIC driver on the same isolated core as the servo thread.

You would need to write your own NIC driver. This has been done by the EtherCAT guys, but I think they have mostly dropped those drivers from their packages in favour of their generic driver, which they acknowledge is not real time but is adequate for most EtherCAT projects. If you compile EtherCAT from source, you can use some realtime NIC drivers, but I'm not sure how current they are.

@Lcvette, I have raised issues with both Debian and the RT kernel team and nobody really cares. If you want changes, you would need to do it yourself.

04 Nov 2024 15:58 #313762 by Lcvette

It's not an issue with Intel MACs. If you disable IRQ coalescing
and pin IRQs, you get pretty decent performance.

For example, on this HP8300 I have been running a test on a 7I76EU
with a 3 kHz servo thread for about 3 days while using the PC for general
web browsing, email, etc. and have not had any issues.
(CPU frequency is 2.9 GHz and times are in CPU clocks.)


 

 



I am getting latency problems with an Intel adapter on just about every image release after 6.1.0-13 for some reason, so it seems something got worse after that point. I can repeatably return to that image and get no more delays. I check whenever a new release comes out to see if that has changed, but so far it has not. I'm not certain what the difference might be, but it seems it could be identified by narrowing down the image update timeline.

Not sure if that is helpful. 6.1.0-13 is nowhere near as good latency-wise as the 4.9 Stretch version was, but at least it did not throw warnings.

I tried searching for a repository of the kernel to see if I could dig through and see what changes were made, but it's very difficult. It looks like kernel.org was the most informative, but I still could not find an actual repository with a history tree to hunt through. Could anyone direct me to the spot to start digging?

Chris

04 Nov 2024 18:21 - 04 Nov 2024 18:22 #313767 by Aciera


I tried searching for a repository of the kernel to see if I could dig through and see what changes were made, but it's very difficult. It looks like kernel.org was the most informative, but I still could not find an actual repository with a history tree to hunt through. Could anyone direct me to the spot to start digging?


Have you tried importing the RT git project into a git client (e.g. GitKraken) and then comparing two branches?
git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git
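
On the command line the same comparison can be sketched roughly as follows. The helper name `kernel_changes`, the tag placeholders and the driver path are assumptions for illustration; list the tags in your clone to find the real names for the two kernel versions you want to compare:

```shell
# Sketch: list the commits between two tags (or branches) that touched a path.
# $1 = path to a local clone
# $2 = older tag or branch
# $3 = newer tag or branch
# $4 = path inside the kernel tree to limit the history to
kernel_changes() {
    git -C "$1" log --oneline "$2..$3" -- "$4"
}

# Usage (tag names and driver path are placeholders, not verified):
#   git clone https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git
#   git -C linux-stable-rt tag -l 'v6.1*'     # find the real tag names first
#   kernel_changes linux-stable-rt <old-tag> <new-tag> drivers/net/ethernet/intel/
```

Limiting the log to one driver directory keeps the list short enough to read through by hand.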
Last edit: 04 Nov 2024 18:22 by Aciera.

04 Nov 2024 18:23 #313768 by PCW
Did you try pinning the IRQ?

This script can be used for testing:

 

File attachment: pinirq_2024-11-04.txt (1 KB)


chmod +x pinirq.txt

then

./pinirq.txt [ethernet_device_name]

My home desktop (an HP EliteDesk G1) needs this;
it's running 6.11.0-rc3-rt3.

Note that Debian RT kernels tend to be terrible
 
05 Nov 2024 15:30 #313821 by Lcvette
Is that doing the same thing as the IRQ coalesce blurb in the network manager setup?

05 Nov 2024 16:50 #313828 by PCW
No, IRQ pinning is different. Both are needed for best latency.
(Note that IRQ pinning needs isolcpus set on the kernel command line.)
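
To make the pinning half concrete, here is a minimal sketch of the general idea. It is not the attached pinirq script; the function names, the device name `eth0` and CPU 3 are assumptions for illustration, and the pinning step has to run as root:

```shell
# Print the IRQ numbers whose /proc/interrupts line mentions the device.
# $1 = ethernet device name
# $2 = interrupts file (defaults to /proc/interrupts)
nic_irqs() {
    awk -v dev="$1" 'index($0, dev) { sub(/:$/, "", $1); print $1 }' \
        "${2:-/proc/interrupts}"
}

# Pin every IRQ of the device to a single CPU (normally the isolcpus one).
# $1 = ethernet device name
# $2 = CPU number
pin_nic_irqs() {
    for irq in $(nic_irqs "$1"); do
        echo "$2" > "/proc/irq/$irq/smp_affinity_list"
    done
}

# Usage, assuming a device called eth0 and isolcpus=3 on the kernel command line:
#   cat /sys/devices/system/cpu/isolated   # should list the CPU given to isolcpus
#   pin_nic_irqs eth0 3
#   ethtool -C eth0 rx-usecs 0             # the separate IRQ coalescing step
```

The `ethtool -C` line is the coalescing setting, which is independent of where the IRQ is handled; that is why both steps are listed.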
