Mesa 7i92 Read Error

18 Jul 2019 13:57 #139805 by 10K
tommylight-
Thanks for the suggestion. The /etc/network/interfaces file had:
auto eth0
iface eth0 inet static
   address 10.10.10.1
   netmask 255.255.255.0

I changed it to
auto eth0
iface eth0 inet static
   address 10.10.10.12
   gateway 10.10.10.1
   dns-nameservers 10.10.10.1

On the test computer, the program ran for 24 hours without an error! I made the same changes on the computer hooked to the lathe, and it got the read error on the 7i92 after about 15 minutes.

18 Jul 2019 16:12 #139824 by tommylight
That is nice, thank you.
Also check whether the other computer has an Intel Ethernet card. There is a line that PCW recommends using, with coalescence set to 0, that should improve things a lot. I cannot recall the exact line, but do a search for it; it should be easy to find.
Not the one with IRQ coalescence, the one with eth.

18 Jul 2019 18:53 #139830 by 10K
I loaded the most recent version of G540x2D.bit. I ran the test on the 7i92 on the lathe computer, and it gave the read error after about 2,000,000 reads. The lights on the 7i92 are acting differently, so I'm confident I'm running the newer code. After the error, the /DONE and INIT lights are still off. Only the PWR light is on. None of the four green lights are on, which is different from what I saw before flashing the new code.

18 Jul 2019 19:15 #139831 by PCW
The green lights don't really mean much, as they just count up on RX packets.

To verify that you have the latest firmware you can run this command:

mesaflash --device 7i92 --addr 10.10.10.10 --verbose

21 Jul 2019 19:14 - 21 Jul 2019 19:24 #140100 by 10K
I tried adding
hardware-irq-coalesce-rx-usecs 0

to the interfaces file.

It almost halved the ping time to the 0.150 to 0.175 range, but it made the read error happen even faster. I had three consecutive tests that all errored in less than 15 minutes.

The network chip in the lathe computer is an Intel I217-V.
Last edit: 21 Jul 2019 19:24 by 10K. Reason: Added chip info

21 Jul 2019 19:21 - 21 Jul 2019 19:22 #140101 by 10K
I ran
mesaflash --device 7i92 --addr 10.10.10.10 --verbose

and got the following output

ETH device 7I92 at ip=10.10.10.10
Communication:
  transport layer: ethernet IPv4 UDP
  ip address: 10.10.10.10
  mac address: 00:60:1B:11:00:72
  protocol: LBP16 version 3
Board info:
  Flash size: 16Mb (id: 0x14)
  Connectors count: 2
  Pins per connector: 17
  Connectors names: P2 P1
  FPGA type: 6slx9tqg144
  Number of leds: 4
Board firmware info:
  memory spaces:
    0: HostMot2 (registers, RW, 32-bit) [size=64K]
    1: KSZ8851 (registers, RW, 16-bit)
    2: EtherEEP (EEPROM, RW, 16-bit) , page size: 1, erase size: 1
    3: FPGAFlsh (flash, RW, 32-bit) [size=16M], page size: 256, erase size: 65536
    4: Timers (memory, RW, 16-bit)
    6: LBP16RW (memory, RW, 16-bit)
    7: LBP16RO (memory, RO, 16-bit)
  [space 0] HostMot2
  [space 2] Ethernet eeprom:
    mac address: 00:60:1B:11:00:72
    ip address: 10.10.10.10
    board name: 7I92
    user leds: eth debug
  [space 3] FPGA flash eeprom:
    flash size: 16Mb (id: 0x14)
  [space 4] timers:
    uSTimeStampReg: 0x4C22
    WaituSReg: 0x0000
    HM2Timeout: 0x0000
  [space 6] LBP16 control/status:
    packets received: all 97, UDP 19, bad 0
    packets sended: all 20, UDP 19, bad 0
    parse errors: 0, mem errors 0, write errors 0
    error flags: 0x0000
    debug LED ptr: 0x0008
    scratch: 0x0000
  [space 7] LBP16 info:
    board name: 7I92
    LBP16 protocol version 3
    board firmware version 17
    IP address jumpers at boot: fixed from EEPROM



I ran another test, and noticed that with the new code the INIT light was on after the failure. According to your earlier post, this means that it failed on communication, not on a power interruption.

I got the 7i92M that you sent for testing. Thanks! I've only run one test with it, and it ran for about 3 hours before getting the read error. This mostly confirms that it's not the board. Since I was getting such long runs on my test machine (the last one was 26 hours without a failure), I've reinstalled LinuxCNC + Debian Jessie from the .ISO image on the website. I'm currently running some tests. It might be a while before I have the results, since I won't be able to work on it for several weeks...
Last edit: 21 Jul 2019 19:22 by 10K. Reason: Fixed formatting

21 Jul 2019 19:35 #140103 by PCW
Weird. It really does sound like something very specific to your setup, as I have not had any similar errors despite multi-year LinuxCNC total run times.

Dumping the tmax values before and after a read error might provide a bit more insight, as would looking at the kernel log to see if the Ethernet link status changed.

To dump the tmax parameters:

halcmd show param *.tmax

To look at the last few kernel log entries:

dmesg | tail -n 10
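
To make the before/after comparison easier, both dumps can be collected into one timestamped report. The following is only a rough Python sketch, not part of LinuxCNC: the two commands are the ones given above, while the function names `run_command` and `snapshot` and the report layout are made up for illustration.

```python
import subprocess
from datetime import datetime

TMAX_CMD = ["halcmd", "show", "param", "*.tmax"]  # command from the post above
DMESG_CMD = ["dmesg"]                             # "tail -n 10" is done in Python


def run_command(argv):
    """Return combined stdout/stderr of argv, or a note if it cannot start."""
    try:
        proc = subprocess.Popen(argv, stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT,
                                universal_newlines=True)
    except OSError as exc:
        return "could not run {}: {}\n".format(argv[0], exc)
    output, _ = proc.communicate()
    return output


def snapshot(label, runner=run_command, tail_lines=10):
    """Build one text report: tmax parameters plus the last kernel log lines."""
    stamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    kernel_tail = runner(DMESG_CMD).splitlines()[-tail_lines:]
    parts = [
        "=== {} at {} ===".format(label, stamp),
        "--- halcmd show param *.tmax ---",
        runner(TMAX_CMD).rstrip(),
        "--- last {} kernel log lines ---".format(tail_lines),
        "\n".join(kernel_tail),
    ]
    return "\n".join(parts) + "\n"
```

Calling `snapshot("before")` once LinuxCNC is running and `snapshot("after")` right after the read error, and saving both strings to files, would give two reports that can be compared side by side. The `runner` argument exists only so the function can be tried without LinuxCNC installed.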

21 Jul 2019 20:09 #140109 by tommylight

10K wrote:
   I tried adding
   hardware-irq-coalesce-rx-usecs 0

I do not recall PCW ever mentioning "IRQ coalescence"; Pl7i92 mentions it all the time.
PCW always writes "something ETH coalescence set to 0", not IRQ!
Again, I do not know if it makes any difference (it should), but could you please try that and report back.
I did try it on a Dell E6510 that has terrible latency, which I use for tests with a 7i92; it went from erroring after 1 to 4 minutes to over 25 minutes.
Just a sec....

forum.linuxcnc.org/27-driver-boards/3559...ethernet-mesa-boards
Give it a try, see if it helps.

28 Jul 2019 00:50 #140780 by 10K
I made the following changes to the Ethernet settings:

/etc/network/interfaces file
auto eth0
iface eth0 inet static
  address 10.10.10.12
  netmask 255.255.255.0

Ethernet connection 1 - IPv4 Settings
Address 10.10.10.12
Netmask 8
Gateway 10.10.10.1
DNS servers 10.10.10.1
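
Note that the two netmasks above are not the same if "Netmask 8" is read as a prefix length, which is an assumption about how that settings dialog interprets the field. A small Python sketch with the standard `ipaddress` module shows the difference; the address `10.0.0.50` is a hypothetical host on some other 10.x network, not something from this setup.

```python
import ipaddress

HOST = "10.10.10.12"          # PC address from the settings above
BOARD = "10.10.10.10"         # 7i92 address used earlier in the thread
OTHER_LAN_HOST = "10.0.0.50"  # hypothetical address, for illustration only


def describe(prefix):
    """Return (network, netmask) for HOST with the given prefix length."""
    iface = ipaddress.ip_interface("{}/{}".format(HOST, prefix))
    return iface.network, iface.netmask


for prefix in (24, 8):
    network, netmask = describe(prefix)
    print("prefix /{}: network {}, netmask {}".format(prefix, network, netmask))
    print("  board on link:", ipaddress.ip_address(BOARD) in network)
    print("  other host on link:", ipaddress.ip_address(OTHER_LAN_HOST) in network)
```

With a /24 mask only 10.10.10.x is treated as directly reachable on this interface, while a /8 mask covers every 10.x.x.x address, which could matter if the regular LAN also uses 10.x addresses.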

I installed ethtool
sudo apt-get install ethtool

I ran
ping 10.10.10.10

before and after I issued this command
sudo ethtool -C eth0 rx-usecs 0

The results of the ping are shown in the attachment

The average ping dropped from 0.406 to 0.257, a 37% decrease. I also noticed something I had seen before and discounted: the initial ping time was much longer than the remaining pings. I'm not sure if anything can be done about this.
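
For anyone repeating the comparison, here is a rough Python sketch of the arithmetic. It assumes the usual Linux `ping` line format with a `time=... ms` field; the sample lines are invented for illustration and are not the attached results. A slow first reply is commonly caused by address resolution (ARP) happening before the first packet, so the sketch also shows the average with the first sample left out.

```python
import re


def percent_decrease(before, after):
    """Relative drop from before to after, in percent."""
    return (before - after) / before * 100.0


def rtt_samples(ping_output):
    """Pull the 'time=... ms' values out of Linux ping output."""
    return [float(t) for t in re.findall(r"time=([0-9.]+) ms", ping_output)]


def mean(values):
    return sum(values) / len(values)


# Averages quoted in the post.
print("{:.1f}% decrease".format(percent_decrease(0.406, 0.257)))

# Hypothetical ping lines, made up for illustration only.
SAMPLE = """\
64 bytes from 10.10.10.10: icmp_seq=1 ttl=64 time=0.900 ms
64 bytes from 10.10.10.10: icmp_seq=2 ttl=64 time=0.300 ms
64 bytes from 10.10.10.10: icmp_seq=3 ttl=64 time=0.200 ms
"""
samples = rtt_samples(SAMPLE)
print("mean of all samples:", mean(samples))
print("mean without first sample:", mean(samples[1:]))
```

Dropping the first sample before averaging keeps a one-off slow reply from skewing the before/after comparison.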

I ran a test on the lathe computer, again using only the 7i92M and an external 5V power supply. The test ran for 18 hours without an error. I had to stop it at that point because I was going to be away from it for several weeks.

So, very promising results so far. When I get back to it, I want to test it with the 7i92 in the electronics enclosure. I want to see if it's just the ethtool command that's giving better results, or the other changes I made to network settings.

I see that I can make the ethtool changes persistent by adding the following to the /etc/network/interfaces file
pre-up /sbin/ethtool -C eth0 rx-usecs 0

I would also like to be able to put the computer back on my local Ethernet network, so that's another thing to test. For this I'll have to change the settings made to Ethernet connection 1, since they disable all other Ethernet connections.

31 Jul 2019 05:52 - 31 Jul 2019 05:54 #140984 by Qenf
I have the same problem: completely random communication dropouts. The affected board has the old firmware. First, I changed the hm2_7i92.0.packet-error-limit parameter from 10 to 1000 (100 is enough), and this removed the problem. I also have a board with the new firmware, and it works without errors. At first I thought the firmware was the cause, but then I decided to swap the hardware and moved the KSZ8851 chip to the other board. The problem moved to the second board with it. The chip is a KSZ8851-16 MLL (1536A3T M156C47M02). I bought a new chip, soldered it in, and now both boards work reliably.
Last edit: 31 Jul 2019 05:54 by Qenf.
