"error finishing read" with Mesa 7i92T on fresh install

15 Jul 2024 14:23 - 15 Jul 2024 14:24 #305233 by shasse
I'm experiencing the following on a fresh install of LinuxCNC from the iso (2.9.3):
hm2/hm2_7i92.0: error finishing read!
iter=15426

This happens intermittently, but typically within a few seconds of jogging the machine around after enabling it. No motors are hooked up, and nothing is attached to the 7i92T other than +5V power. If I enable the machine but don't move any axes, it will eventually happen after some number of minutes. I'm using a generic mini PC and have tried two identical PCs, each with a fresh LinuxCNC install, and have the same problem on both. I've also swapped Ethernet cables with the same result.

I've read through this thread forum.linuxcnc.org/27-driver-boards/4691...error-finishing-read and nothing really conclusive came out of it. I also get a realtime error on LinuxCNC startup or shortly after, but the latency test in pncconf shows fairly good numbers. I don't have the specifics in front of me but I can get them.

I have not tried switching the +5V supply, but that is my next step.

Any thoughts on how I can further diagnose this?

Thanks!

Scott
Last edit: 15 Jul 2024 14:24 by shasse. Reason: fix typo

15 Jul 2024 14:34 #305234 by PCW
Can you run this command:

sudo chrt 99 ping -i .001 -q 10.10.10.10

Hit Ctrl-C, then run it again:

sudo chrt 99 ping -i .001 -q 10.10.10.10

Wait about a minute, hit Ctrl-C again, and copy and paste the result here.
(Those commands assume you have the 7I92T IP address set to 10.10.10.10; if not, use 192.168.1.121 in the ping commands.)

Also run
lshw -class network
to determine what Ethernet hardware you have
 
15 Jul 2024 22:06 #305283 by rodw

17 Jul 2024 01:31 #305440 by shasse
Sorry, I have not been able to get back to these diagnostics. I tried another type of mini PC with the same failure. I tried the driver update and kernel parameters from Rod's docs (all the mini PCs had RTL8111 Ethernet chipsets), then I used a regular, fairly old desktop, which ran for much longer but eventually hit the same issue.

I'm going to order another 7i92T to see if there is potentially a problem with the card itself.

At one point I flashed this card with the PathPilot 7i92T firmware as part of a previous experiment running PathPilot on the mini PC, but I flashed it to the 7i92t_g540d firmware before all of this. I don't know if anything can persist from the PathPilot flash process.

Thanks,

Scott

17 Jul 2024 04:31 #305448 by PCW
It's very unlikely to be a card issue.

Try the ping command in the previous post to see what the Ethernet latency is.

You also have to disable basically all power management, speed shifting, turbo modes, etc. in the PC BIOS.

Finally, using isolcpus and pinning the Ethernet IRQ to the last CPU can help a lot.

If you have isolcpus set up in the kernel command line, you can temporarily pin the Ethernet IRQ with this script:

File Attachment: pinirq_2024-07-17.txt (1 KB)

and check that it's done the expected thing with this script:

File Attachment: checkmask_...7-17.txt (1 KB)
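
The attached scripts are not reproduced in the thread. As a rough idea of what pinning an Ethernet IRQ involves, here is a minimal sketch built around the kernel's /proc/irq/<n>/smp_affinity interface. It is not the content of pinirq or checkmask. The helper names last_cpu_mask and irqs_for_iface are made up for illustration, and the sketch assumes that the interface name (enp3s0 in this thread) appears in /proc/interrupts and that the machine has at most 32 CPUs.

```shell
# Sketch only: NOT the attached pinirq/checkmask scripts.

# Print the hex affinity mask that selects only the last of N CPUs.
last_cpu_mask() {
    # $1 = number of CPUs (e.g. from `nproc --all`)
    printf '%x\n' $(( 1 << ($1 - 1) ))
}

# Print the IRQ numbers whose /proc/interrupts line mentions an interface.
irqs_for_iface() {
    # $1 = interface name; reads /proc/interrupts-style text on stdin
    awk -v ifc="$1" 'index($0, ifc) { sub(/:$/, "", $1); print $1 }'
}

# Possible use (needs root; the setting is lost on reboot):
#   mask=$(last_cpu_mask "$(nproc --all)")
#   for irq in $(irqs_for_iface enp3s0 < /proc/interrupts); do
#       echo "$mask" | sudo tee "/proc/irq/$irq/smp_affinity"
#       cat "/proc/irq/$irq/smp_affinity"   # read back to check the mask
#   done
```

On a 4-CPU machine the mask would be 8, which selects CPU 3 only. This is only useful if that CPU is also the one named in isolcpus.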

19 Jul 2024 14:11 #305651 by shasse
Here are the ping statistics without driver/kernel/isolcpu changes:

root@router:~# sudo chrt 99 ping -i .001 -q 10.10.10.10
PING 10.10.10.10 (10.10.10.10) 56(84) bytes of data.
c^C
--- 10.10.10.10 ping statistics ---
3449 packets transmitted, 3448 received, 0.0289939% packet loss, time 3458ms
rtt min/avg/max/mdev = 0.059/0.110/1.041/0.047 ms
root@router:~# sudo chrt 99 ping -i .001 -q 10.10.10.10
PING 10.10.10.10 (10.10.10.10) 56(84) bytes of data.
^C
--- 10.10.10.10 ping statistics ---
67441 packets transmitted, 67441 received, 0% packet loss, time 67678ms
rtt min/avg/max/mdev = 0.057/0.101/10.259/0.057 ms, pipe 2

and here is the lshw output:

root@router:~# lshw -class network
*-network
description: Ethernet interface
product: RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
vendor: Realtek Semiconductor Co., Ltd.
physical id: 0
bus info: pci@0000:03:00.0
logical name: enp3s0
version: 0c
serial: 1c:1b:0d:86:b9:6e
size: 100Mbit/s
capacity: 1Gbit/s
width: 64 bits
clock: 33MHz
capabilities: pm msi pciexpress msix vpd bus_master cap_list ethernet physical tp mii 10bt 10bt-fd 100bt 100bt-fd 1000bt-fd autonegotiation
configuration: autonegotiation=on broadcast=yes driver=r8169 driverversion=6.1.0-18-rt-amd64 duplex=full firmware=rtl8168g-2_0.0.1 02/06/13 ip=10.1.65.122 latency=0 link=yes multicast=yes port=twisted pair speed=100Mbit/s
resources: irq:18 ioport:e000(size=256) memory:d0604000-d0604fff memory:d0600000-d0603fff


I'm at the shop for a good while today so I plan on working through the rest of the suggestions in this thread today.

Thanks!

Scott

19 Jul 2024 14:30 #305654 by PCW
rtt min/avg/max/mdev = 0.057/0.101/10.259/0.057 ms, pipe 2

A maximum over 10 ms suggests that either network power management is enabled (in the BIOS) or there is a problem with the RTK driver. I would try using the DKMS RTK 8168 driver:

docs.google.com/document/d/1jeV_4VKzVmOI...ading=h.7ydcp08f6tyu
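
For repeat runs of the ping test, the worst-case round-trip time can be pulled out of the summary line automatically. This is a minimal sketch, assuming the iputils summary format shown in the output above. The function names ping_max_rtt and rtt_ok are hypothetical, and the 1 ms default limit is an assumption based on the common 1 ms servo period, not a figure from Mesa or from this thread.

```shell
# Print the "max" field from ping's "rtt min/avg/max/mdev = ..." line.
ping_max_rtt() {
    # Reads ping output on stdin.
    awk '/^rtt / { split($4, f, "/"); print f[3] }'
}

# Succeed if a max rtt (ms) is within a limit (ms, default 1.0: an assumption).
rtt_ok() {
    # $1 = max rtt in ms, $2 = optional limit in ms
    awk -v max="$1" -v limit="${2:-1.0}" 'BEGIN { exit !(max + 0 <= limit + 0) }'
}

# Possible use (stop ping after 60000 packets instead of pressing Ctrl-C):
#   max=$(sudo chrt 99 ping -i .001 -q -c 60000 10.10.10.10 | ping_max_rtt)
#   rtt_ok "$max" || echo "worst-case rtt ${max} ms is too high"
```

Fed the second run above, ping_max_rtt would print 10.259, which fails the 1 ms check.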

19 Jul 2024 16:32 #305660 by shasse
I updated the RTK 8168 driver following the procedure in Rod's document, and that seems to have resolved the issue. I had done that before, but after setting the "r8168.aspm=0 r8168.eee_enable=0 pcie_aspm=off loglevel=3" kernel parameters in /etc/default/grub I had neglected to run "sudo update-grub", so the kernel parameters were never actually applied. This time I ran it, and that seems to have made a significant difference.

Rod: Your detailed document was extremely helpful. It might be worth adding a small blurb about editing /etc/default/grub and running sudo update-grub to get them to stick, or point folks to an existing location where that procedure is documented. I struggled to find a canonical "clean" location of documentation for how to do that. askubuntu.com/questions/19486/how-do-i-a...ernel-boot-parameter was the closest I could find.
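
For anyone landing here with the same problem, the procedure described above can be sketched roughly as follows. This assumes a Debian-style GRUB setup like the one on the LinuxCNC ISO. The leading "quiet" on the example line is an assumption about what is already in the file, and missing_params is a hypothetical helper name, not part of any package.

```shell
# 1. Edit /etc/default/grub and append the parameters to the existing line, e.g.:
#      GRUB_CMDLINE_LINUX_DEFAULT="quiet r8168.aspm=0 r8168.eee_enable=0 pcie_aspm=off loglevel=3"
# 2. Regenerate the boot configuration, then reboot:
#      sudo update-grub
#      sudo reboot
# 3. After the reboot, confirm the running kernel actually received them.

# Print each expected parameter that is absent from a kernel command line.
missing_params() {
    # $1 = kernel command line text, remaining args = expected parameters
    cmdline=" $1 "
    shift
    for p in "$@"; do
        case "$cmdline" in
            *" $p "*) ;;                 # present as a whole word
            *) printf '%s\n' "$p" ;;     # not found
        esac
    done
}

# Possible use:
#   missing_params "$(cat /proc/cmdline)" r8168.aspm=0 r8168.eee_enable=0 pcie_aspm=off
```

No output from the check means every listed parameter reached the kernel. Any parameter that is printed suggests update-grub was skipped or the edit went to the wrong line.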

But this issue is resolved for me now. Thanks so much Rod and Peter!

19 Jul 2024 21:42 - 19 Jul 2024 21:42 #305672 by PCW
I wonder how much of the improvement is from changing the driver and how much is
just from disabling EEE and ASPM (either of which would be fatal for network latency).

Maybe the old driver does not give you access to those settings...
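
One way to separate the two effects might be to check the ASPM and EEE state under each driver before and after applying the kernel parameters. The following is a minimal sketch. The function name aspm_state is hypothetical, and it assumes the usual "LnkCtl: ASPM ...;" line in lspci -vv output and that the driver in use answers ethtool's EEE query.

```shell
# Print the ASPM state from the LnkCtl line of `lspci -vv` output,
# e.g. "Disabled" or "L1 Enabled".
aspm_state() {
    # Reads `lspci -vv -s <addr>` text on stdin.
    sed -n 's/.*LnkCtl:[[:space:]]*ASPM \([^;]*\);.*/\1/p'
}

# Possible use. The NIC in this thread is at pci@0000:03:00.0 per the lshw output:
#   sudo lspci -vv -s 03:00.0 | aspm_state
# EEE status, where the driver supports the query:
#   sudo ethtool --show-eee enp3s0
```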
 
Last edit: 19 Jul 2024 21:42 by PCW.

20 Jul 2024 02:32 #305678 by shasse
I can try to isolate that and I'll let you know.
