Latency, error finishing read, and IRQ affinity
12 May 2024 16:20 #300450
by Swag
Replied by Swag on topic Latency, error finishing read, and IRQ affinity
I thought I had. I edited the section with the IP address to use my card's IP address of 10.10.10.11. Obviously I’ve made a mistake; I’ll go back through it again to see.
12 May 2024 16:29 #300452
by PCW
Replied by PCW on topic Latency, error finishing read, and IRQ affinity
Yeah, a bit of error/status reporting would improve that script.
12 May 2024 16:39 #300456
by Swag
Replied by Swag on topic Latency, error finishing read, and IRQ affinity
I think I see my mistake: I only changed the second instance (line 145) to 10.10.10.11 and overlooked line 117. Sorry for the trouble and thanks for the help!
I knew it was something I had missed. Odd that the old Debian 10 vanilla install just worked, but Debian 12 was so finicky about having the CPU isolated and the IRQ pinned.
12 May 2024 16:54 #300458
by PCW
Replied by PCW on topic Latency, error finishing read, and IRQ affinity
Yes, I think some kernel change made Ethernet latency more sensitive to IRQ pinning.
12 May 2024 18:17 #300464
by Swag
Replied by Swag on topic Latency, error finishing read, and IRQ affinity
If I rerun rt_setup.sh, it does not update the two files in /etc/linuxcnc, and manually editing lcnc_setirqaffinities.sh to have the correct IP does not result in the IRQ being pinned to the correct CPU (it is still on 2 instead of 8). So it looks like I still have some investigating to do, but at least I now know what needs to be accomplished.
The following user(s) said Thank You: DHeineck
30 May 2024 02:23 #301800
by BryceJ
Replied by BryceJ on topic Latency, error finishing read, and IRQ affinity
Thanks for the script! I had been fighting this issue for a few days, with errors talking to a Mesa board on an N100 that I was setting up. It worked great once I fixed the same issue Swag had. My static IP had been 10.10.10.11. I changed it to 10.10.10.1 and the ping to 10.10.10.10 was good; previously I was getting random 3ms spikes.
ping 10.10.10.10 -i .02
...
325 packets transmitted, 325 received, 0% packet loss, time 8151ms
rtt min/avg/max/mdev = 0.080/0.101/0.127/0.007 ms
The following user(s) said Thank You: tommylight
07 Aug 2024 14:27 - 07 Aug 2024 14:33 #307179
by CNC_Tinkerer
Replied by CNC_Tinkerer on topic Latency, error finishing read, and IRQ affinity
Hi!
I am working with a custom ethernet interface and think I really need this IRQ management. Unfortunately I have not been able to get what's described here to work after quite a few hours beating my head on the desk and rereading ALL these posts several times. I've tried a number of things but I don't know much about scripting. I'm probably missing something obvious.
If someone has debugging suggestions or can tell me what other information to post here I would really appreciate it.
Architecture: x86_64
CPU(s): 8
On-line CPU(s) list: 0-7
Thread(s) per core: 2
Core(s) per socket: 4
Socket(s): 1
NUMA node(s): 1
Vendor ID: AuthenticAMD
CPU family: 23
Model: 17
Model name: AMD Ryzen 5 PRO 2400G with Radeon Vega Graphics
The obvious things:
I did install irqbalance via Synaptic. I did change the IP to 10.0.0.1 in two places in rt_setup.sh; that's the address in my interfaces file.
What happens:
If I run rt_setup.sh it says it is creating the affinity file, and it ends up in /etc/linuxcnc as expected with permissions correct. But after restart I seem to end up with ethernet interrupts always on cpu5 where I would expect cpu7.
user1@debian2:~/linuxcnc/irq-mgmt$ sudo ./rt_setup.sh
[sudo] password for user1:
Linux debian2 4.19.0-16-rt-amd64 #1 SMP PREEMPT RT Debian 4.19.181-1 (2021-03-19) x86_64 GNU/Linux
irqbalance is running AND there are more than 2 cpus - setting up policy script
The script comes up with isolcpus 6,7 in grub, which is what I determined months ago and added manually and seems like it should be correct.
linux /boot/vmlinuz-4.19.0-16-rt-amd64 root=UUID=5b220cf2-b101-424f-9abb-ee2246a65503
ro initrd=/install/gtk/initrd.gz quiet
isolcpus=6,7 intel_idle.max_cstate=1 i915.enable_rc6=0
my /etc/network/interfaces (after removing commented out lines and loopback):
# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).
source /etc/network/interfaces.d/*
# NEW - Lenovo Ryzen desktop PC
allow-hotplug enp8s0f0
iface enp8s0f0 inet static
address 10.0.0.1
Significant things I tried:
Running test script "pinirq_2024-05-12.txt" with parameter enp8s0f0 displays expected results. But watching interrupts still shows the NIC IRQs on cpu5.
user1@debian2:~/linuxcnc/irq-mgmt$ sudo ./pinirq enp8s0f0
Cores: 8 Old CPU Mask: 0020 Set device enp8s0f0 IRQ 61 CPU mask to 0128
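For reading the masks printed above: an IRQ affinity mask is a bit mask in which bit N selects CPU N, and /proc/irq/<n>/smp_affinity is interpreted as hexadecimal. The following is only a sketch of that arithmetic; cpu_to_mask is a hypothetical helper and is not part of rt_setup.sh or pinirq.

```shell
#!/bin/sh
# Hypothetical helper (not from the scripts in this thread): print the
# hexadecimal affinity mask that selects a single CPU number.
cpu_to_mask() {
    # 1 << N sets bit N; %x prints the value in hexadecimal
    printf '%x\n' "$((1 << $1))"
}

cpu_to_mask 5   # mask for CPU 5
cpu_to_mask 7   # mask for CPU 7
```

By this arithmetic the old mask 0020 corresponds to CPU 5, and CPU 7 alone would be 80 in hex. If the 0128 in the output is decimal 128 written out as-is, it would presumably be read as hex 128, which is a different set of bits than CPU 7. That is a guess from the printed output only and would need checking against the script itself.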
I tried using NIC=`enp8s0f0` in place of both awks in rt_setup.sh but that does not work either (note the quotes I used!).
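On the quoting: in a POSIX shell, backticks are command substitution rather than a string quote, so the assignment as written asks the shell to run enp8s0f0 as a command. A minimal sketch of the difference, assuming no program named enp8s0f0 exists on the PATH:

```shell
#!/bin/sh
# Double quotes (or no quotes at all) assign the literal interface name.
NIC="enp8s0f0"
echo "quoted:    [$NIC]"

# Backticks are command substitution: the shell tries to RUN enp8s0f0.
# Assuming no such command exists, the substitution produces no output,
# so NIC is left empty. "|| true" only keeps the sketch from aborting
# if it is run under "set -e".
NIC=`enp8s0f0 2>/dev/null` || true
echo "backticks: [$NIC]"
```

If rt_setup.sh received an empty NIC this way, that could explain why the substitution did not help, though that is an assumption about how the script uses the variable.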
I did have loopback lines and old (commented out) iface and address lines in interfaces file. I removed them and reran everything as I'm not sure how picky that grep is.
user1@debian2:~/linuxcnc/irq-mgmt$ sudo /etc/init.d/irqbalance status -l
● irqbalance.service - irqbalance daemon
Loaded: loaded (/lib/systemd/system/irqbalance.service; enabled; vendor preset: enabled)
Active: active (running) since Wed 2024-08-07 08:31:44 EDT; 1h 8min ago
Docs: man:irqbalance(1)
https://github.com/Irqbalance/irqbalance
Main PID: 577 (irqbalance)
Tasks: 2 (limit: 4915)
Memory: 6.3M
CGroup: /system.slice/irqbalance.service
└─577 /usr/sbin/irqbalance --foreground --policyscript=/etc/linuxcnc/lcnc_irqbalancepolicy.sh
Aug 07 08:31:44 debian2 systemd[1]: Started irqbalance daemon.
section from /proc/interrupts:
60: 0 0 0 0 0 0 0 0 IR-PCI-MSI 4726791-edge xhci_hcd
61: 0 0 0 0 0 71062 0 0 IR-PCI-MSI 4194304-edge enp8s0f0
62: 0 0 0 697991 0 0 0 0 IR-PCI-MSI 4718592-edge amdgpu
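Each row above has the IRQ number first, then one count column per CPU starting at CPU0. As a sketch of how to read a row mechanically, the hypothetical helper busiest_cpu below (not part of the scripts in this thread) prints the CPU index with the highest count.

```shell
#!/bin/sh
# Hypothetical helper: given one /proc/interrupts row and the number of
# CPUs, print the index of the CPU column with the highest count.
busiest_cpu() {
    echo "$1" | awk -v n="$2" '{
        best = 0; max = $2 + 0            # field 2 is the CPU0 count
        for (cpu = 1; cpu < n; cpu++)     # field cpu+2 is the count for that CPU
            if ($(cpu + 2) + 0 > max) { max = $(cpu + 2) + 0; best = cpu }
        print best
    }'
}

# Row copied from the listing above
busiest_cpu "61: 0 0 0 0 0 71062 0 0 IR-PCI-MSI 4194304-edge enp8s0f0" 8

# On a live system, something like this should work (untested here):
# busiest_cpu "$(grep enp8s0f0 /proc/interrupts)" "$(nproc)"
```

For the enp8s0f0 row this reports CPU 5, matching what was observed.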
Last edit: 07 Aug 2024 14:33 by CNC_Tinkerer. Reason: code formatting still butchered
07 Aug 2024 14:46 #307182
by CNC_Tinkerer
Replied by CNC_Tinkerer on topic Latency, error finishing read, and IRQ affinity
Is it possible /etc/default/irqbalance is not being used and thus not reaching our lcnc_irqbalancepolicy.sh file? I don't understand how that process is supposed to work...
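One way to check this is to look at the command line of the running daemon. The status output in the previous post shows irqbalance started with --policyscript=/etc/linuxcnc/lcnc_irqbalancepolicy.sh, which suggests the option is reaching the daemon. The sketch below uses a hypothetical helper, policyscript_of, to pull that option out of a command line string.

```shell
#!/bin/sh
# Hypothetical helper: print the --policyscript path found in a command
# line string, or print nothing if the option is absent.
policyscript_of() {
    for arg in $1; do                     # unquoted on purpose: split on spaces
        case "$arg" in
            --policyscript=*) echo "${arg#--policyscript=}" ;;
        esac
    done
}

# Command line copied from the irqbalance status output in this thread
policyscript_of "/usr/sbin/irqbalance --foreground --policyscript=/etc/linuxcnc/lcnc_irqbalancepolicy.sh"

# On a live system (assumes irqbalance is running and pgrep is installed):
# policyscript_of "$(tr '\0' ' ' < /proc/$(pgrep -o irqbalance)/cmdline)"
```

This only confirms that the path is passed to irqbalance; whether the policy script then produces the intended result is a separate question.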