Unexpected realtime delay on task 0 with period 1000000
@PCW who would be the person or group best suited to fixing it, and do they know about it? Is this a LinuxCNC thing or a Debian thing?
Honestly it's a Linux thing (broken driver), but it would be easier to fix in a non-Debian-compliant ISO.
Ultimately it should get fixed in the kernel driver so the proprietary Realtek driver is not needed.
Interesting, would it help the Intel adapter performance as well? Bring it back to the performance of the 4.9 kernel days? I miss those days...lol
I don't think this is a kernel bug; it is related to some decisions made about which Realtek NIC drivers are included in the kernel. The R8169 driver is a generic NIC driver for the Realtek cards, but it fails in a real-time environment on the R8168/R8111 hardware (and also the Realtek R8125 driver). I suspect it is related to licensing restrictions on mixing Realtek binaries with FOSS code.
Pre kernel 5.10 (Bullseye), you had to download the driver sources from Realtek and build your own driver but this was kernel specific. Users of the Odroid H2+ will remember having to compile the R8125 driver. It was hard to find a working version for your kernel. Somewhere around kernel 5.9, I think the Realtek code was separated into different drivers and the R8169 was included in the kernel. But the non-FOSS code from Realtek was separated out into DKMS driver modules. DKMS allows kernel modules to be distributed in source code and built for a specific kernel automatically. This is recompiled if you upgrade your kernel.
So here is the problem: the R8168_dkms driver is not part of the kernel, so it's not considered a kernel bug. The RT project does not see it as part of their remit, as a driver is available.
I have tried reporting bugs with Debian and the RT kernel team, but it is exhausting and unproductive, so I gave up.
It gets a bit worse: because the R8168_dkms driver is not a kernel module, it is not included in the Debian non-free-firmware component, so your sources.list still specifically requires the non-free component. (We add this for you in the LinuxCNC ISO.)
I would call it a Preempt-RT kernel bug, since the r8169 driver is a FOSS kernel driver that fails in RT environments. Unfortunately it looks like very little, if any, RT work has been done on network drivers (looking at the RT patches).
It looks like the main driver-related RT patches are for video drivers...
Agreed about the FOSS driver, which is what I said in a roundabout way.
"I would call it a Preempt-rt kernel bug since the r8169 driver is a FOSS kernel driver that fails in RT environments. Unfortunately it looks like very little if any RT work has been done on network drivers (looking at the RT patches)"
There are no real-time network drivers because network communication is not considered to be real time, so it's not really an RT issue. That's why CPU affinity becomes important, as it keeps the non-RT NIC driver on the same isolated core as the servo thread.
You would need to write your own NIC driver. This has been done by the EtherCAT guys, but I think they have mostly dropped them from their packages in favour of their generic driver, which they acknowledge is not real time but is adequate for most EtherCAT projects. If you compile EtherCAT from source, you can use some real-time NIC drivers, but I'm not sure how current they are.
@Lcvette, I have raised issues with both Debian and the RT kernel team and nobody really cares. If you want changes, you would need to do it yourself
It's not an issue with Intel MACs. If you disable IRQ coalescing and pin IRQs, you get pretty decent performance.
For example, on this HP8300 I have been running a test on a 7I76EU with a 3 kHz servo thread for about 3 days while using the PC for general web browsing, email, etc., and have not had any issues.
(CPU frequency is 2.9 GHz and times are in CPU clocks)
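As a rough illustration of "disable IRQ coalescing and pin IRQs", here is a minimal shell sketch. It is not the script attached later in this thread; the function names `irqs_for_dev` and `show_pin_commands` are made up for this example, and it only prints the commands rather than running them. It assumes the NIC's interrupt lines in /proc/interrupts contain the interface name, and that `rx-usecs 0` is the coalescing setting your driver honours, which varies by driver.

```shell
#!/bin/sh
# Sketch only: prints the commands instead of running them.
# Run the printed commands as root once you have checked them.

# Print the IRQ numbers whose /proc/interrupts line mentions the device name.
# $1 = interrupts file (normally /proc/interrupts), $2 = device name
# Note: this is a plain substring match, so "eth0" also matches "eth0-rx-0".
irqs_for_dev() {
    awk -v dev="$2" 'index($0, dev) { sub(/:.*/, "", $1); print $1 }' "$1"
}

# Print the commands that would disable RX coalescing and pin the IRQs.
# $1 = device name, $2 = CPU number, $3 = interrupts file (optional)
show_pin_commands() {
    dev=$1
    cpu=$2
    file=${3:-/proc/interrupts}
    echo "ethtool -C $dev rx-usecs 0"
    for irq in $(irqs_for_dev "$file" "$dev"); do
        echo "echo $cpu > /proc/irq/$irq/smp_affinity_list"
    done
}
```

For example, `show_pin_commands eth0 3` lists what would be run to move the interrupts of a device named eth0 (a placeholder name) onto CPU 3.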
I am getting latency with an Intel adapter on just about every image release after 6.1.0-13 for some reason, so it seems something got worse after that point. I can repeatably return to that image and get no more delays. I check whenever a new release comes out to see if that has changed, but so far it has not. I'm not certain what the difference might be, but it could probably be identified by narrowing down the image update timeline.
Not sure if that is helpful. 6.1.0-13 is nowhere near as good as the 4.9 Stretch version was latency-wise, but at least it did not throw warnings.
I tried searching for a repository of the kernel so I could dig through and see what changes were made, but it's very difficult. kernel.org looked the most informative, but I still could not find an actual repository with a history tree to hunt through. Can anyone direct me to the spot to start digging?
Chris
Have you tried importing the RT git project into a git client (e.g. GitKraken) and then comparing two branches?
git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git
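If you prefer the command line to a GUI client, the same comparison can be sketched with plain git. This is only a sketch: `changes_between` is a helper name invented here, and OLD_TAG and NEW_TAG are placeholders, so list the real names with `git tag -l` or `git branch -r` in your clone. Restricting the log to a path such as drivers/net/ethernet/intel keeps the output manageable.

```shell
#!/bin/sh
# List commits that touched one path between two revisions of a local clone.
# $1 = path to the clone, $2 = older revision, $3 = newer revision,
# $4 = path inside the source tree to restrict the log to
changes_between() {
    git -C "$1" log --oneline "$2..$3" -- "$4"
}

# Example (not run here; the clone is large and the tag names are placeholders):
#   git clone https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git
#   changes_between linux-stable-rt OLD_TAG NEW_TAG drivers/net/ethernet/intel
```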
This script can be used for testing:
chmod +x pinirq.txt
then
./pinirq.txt [ethernet_device_name]
My home desktop (an HP EliteDesk G1) needs this.
It's running 6.11.0-rc3-rt3.
Note that Debian RT kernels tend to be terrible.
(Note that the IRQ pinning needs isolcpus set on the kernel command line.)
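A quick way to check whether isolcpus is actually set on the running kernel is to look at /proc/cmdline. The helper below is a small sketch with an invented name (`isolcpus_setting`); it prints the value if present and returns a non-zero status otherwise.

```shell
#!/bin/sh
# Print the isolcpus= value from a kernel command line, or fail if it is absent.
# $1 = file to read (defaults to /proc/cmdline)
isolcpus_setting() {
    tr ' ' '\n' < "${1:-/proc/cmdline}" | sed -n 's/^isolcpus=//p' | grep .
}
```

Typical use would be `isolcpus_setting || echo "isolcpus is not set"`.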