Mesa 7I76e and step rates and latency
01 Oct 2017 01:55 #99712
by blazini36
Mesa 7I76e and step rates and latency was created by blazini36
I'm working with a Mesa 7i76e on a stepper-driven machine with only 1 axis at this point (it will have at least 2). The program and GUI are completely custom. Currently I'm getting some crashes and freezes while the program is running automated movements, but I'm realizing this is likely due to a camera and how it's being displayed in the GUI using up too many resources, which is something we have to sort out anyway. I'm not using any G-code, just jogging between positions set in the Python code.
I get slightly confused by the 7i76e as it only runs on the servo thread and a lot of the info I find revolves around the base thread. So I have to fix the GUI camera issue, but I'm trying to see what the realistic expectations of the system are once that is sorted, as I likely have to change the step rate as well when I add a geared stepper and larger-pitch belt to reduce some jerkiness. I'm calculating the step scale to be ~1700 with 1 and 2 inch per second automated jog movements.
Running a latency test with the base thread disabled and a few glxgears instances running nets about 30 usec jitter. With the camera displayed properly in a separate program, jitter runs up to almost 150 usec. I assume that LinuxCNC with the current GUI displaying the camera as is would net a higher jitter, based on the latency of the image displayed, and that may be the cause of the crashes, but I have not tested latency with the GUI running as is.
So assuming that the best I can do is 150 usec jitter with the PC as is (it's not really feasible to modify this PC much), I'm trying to figure out if I'm going to wind up with a reliable system when all is said and done. The servo period is set at 1000000 ns and I'm wondering if I should change that as well.
01 Oct 2017 02:57 #99716
by PCW
Replied by PCW on topic Mesa 7I76e and step rates and latency
The 7I76E's step rate is not related to latency or the servo thread rate;
step rates up to 10 MHz are possible.
Latency excursions will not cause LinuxCNC to crash but may increase the
following error. If LinuxCNC is crashing, that's another issue.
01 Oct 2017 04:57 #99722
by blazini36
Replied by blazini36 on topic Mesa 7I76e and step rates and latency
The stepper docs mention LinuxCNC freezing or locking up if the base thread period is set too low, which I thought somehow applied to the servo thread in the case of the 7i76e.
So I assume it's all hardware step generation on the 7i76e then? What kind of info is the driver sending the board on a jog command? The 7i76e has seemed pretty accurate and I haven't had any issues with it at all. It kind of makes me wonder why anyone would bother using something else, especially if it's not much affected by latency.
This is a small industrial PC so I can't really change much on it. I did turn off all the power-saving features and HT in the BIOS. I do get some following errors in Axis when running the camera in a separate program. This PC has 2 gigabit Ethernet cards which I believe are on the USB 3.0 bus, and they are recognized at the BIOS level. The camera is also USB 3.0.
I do realize the way the program is handling the camera is probably most of the lock-up issue; CPU usage goes pretty close to 100% vs around 60% under other camera apps. So aside from the lock-ups, is 150 usec latency acceptable for the 7i76e, and do you recommend adjusting the servo period?
01 Oct 2017 16:05 - 01 Oct 2017 19:15 #99736
by PCW
Replied by PCW on topic Mesa 7I76e and step rates and latency
If your Ethernet MACs are connected via USB, I don't think it can ever be reliable for real time.
Can you post your dmesg results here? That would reveal the Ethernet hardware.
150 usec latency in the latency test is possibly OK, but the Ethernet latency is what's important.
You can monitor the peak servo thread time (which will include the Ethernet latency)
by watching the parameter servo-thread.tmax.
Ideally this should never be more than about 1/2 of the value of the pin motion.servo.last-period
(both of these times are in CPU clocks).
The 7I76E (and all Mesa stepgen hardware) receives velocity commands
and sends back the position every servo cycle (with a resolution of 1/65536 of a step).
An external feedback system corrects for the minor errors due to timing variations
between LinuxCNC and the hardware to keep the position correct within a small fraction of an
externally generated step.
If you miss a servo thread deadline (and skip a whole cycle) you will only get a significant
error during acceleration/deceleration. This error is still quite small for normal machine accelerations.
For example, if you have a 1 G acceleration (= ~400 IPS/S, which is high for most machines) and missed a 1 ms servo
thread, in the worst case this would result in a missed velocity delta update of 0.4 IPS, resulting in a path deviation of 0.2 mils.
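The arithmetic in that example can be checked with a short Python sketch. It assumes the 0.2 mil figure comes from the constant-acceleration distance term 0.5 * a * t^2 over the one skipped period; that reading is an assumption, not something stated in the post, and the function name is made up for illustration.

```python
# Sketch: error caused by skipping servo-thread updates while accelerating.
# Assumption: the path deviation is the constant-acceleration term 0.5*a*t^2
# accumulated over the skipped time (this reproduces the 0.2 mil figure).

def missed_cycle_error(accel_ips2, servo_period_s, missed_cycles=1):
    """Return (velocity_delta_ips, path_deviation_in) for skipped updates."""
    dt = servo_period_s * missed_cycles
    velocity_delta = accel_ips2 * dt             # velocity change that was not applied
    path_deviation = 0.5 * accel_ips2 * dt ** 2  # distance error built up meanwhile
    return velocity_delta, path_deviation

# 1 G rounded to 400 in/s^2 as in the post, with one missed 1 ms cycle
dv, dx = missed_cycle_error(400.0, 0.001)
print(f"velocity delta: {dv:.2f} IPS")        # 0.40 IPS
print(f"path deviation: {dx * 1000:.2f} mils")  # 0.20 mils
```

Because the deviation grows with the square of the skipped time, two missed cycles in a row would give roughly four times the error under the same assumption.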
Last edit: 01 Oct 2017 19:15 by PCW.
01 Oct 2017 23:32 #99754
by blazini36
Replied by blazini36 on topic Mesa 7I76e and step rates and latency
That's good info, thanks
I had to shorten it up a bit but this is probably everything relevant.
user1@viewer1 ~ $ dmesg | grep eth
[ 5.737919] igb 0000:01:00.0: added PHC on eth0
[ 5.737923] igb 0000:01:00.0: eth0: (PCIe:2.5Gb/s:Width x1) 00:0e:c4:d1:3f:93
[ 5.737925] igb 0000:01:00.0: eth0: PBA No: FFFFFF-0FF
[ 5.771182] igb 0000:02:00.0: added PHC on eth1
[ 5.771185] igb 0000:02:00.0: eth1: (PCIe:2.5Gb/s:Width x1) 00:0e:c4:d1:3f:94
[ 5.771187] igb 0000:02:00.0: eth1: PBA No: FFFFFF-0FF
[ 13.143580] igb 0000:01:00.0 eth0: igb: eth0 NIC Link is Up 100 Mbps Full Duplex, Flow Control: RX/TX
[ 14.479822] igb 0000:02:00.0 eth1: igb: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
user1@viewer1 ~ $ dmesg | grep usb
[ 5.598483] usbcore: registered new interface driver usbfs
[ 5.598498] usbcore: registered new interface driver hub
[ 5.599550] usbcore: registered new device driver usb
[ 5.602944] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002
[ 5.602947] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[ 5.602948] usb usb1: Product: xHCI Host Controller
[ 5.602950] usb usb1: Manufacturer: Linux 4.9.35-rt25mah xhci-hcd
[ 5.602951] usb usb1: SerialNumber: 0000:00:14.0
[ 5.621614] usb usb2: New USB device found, idVendor=1d6b, idProduct=0003
[ 5.621616] usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[ 5.621618] usb usb2: Product: xHCI Host Controller
[ 5.621619] usb usb2: Manufacturer: Linux 4.9.35-rt25mah xhci-hcd
[ 5.621621] usb usb2: SerialNumber: 0000:00:14.0
[ 5.663190] usb usb3: New USB device found, idVendor=1d6b, idProduct=0002
[ 5.663192] usb usb3: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[ 5.663194] usb usb3: Product: EHCI Host Controller
[ 5.663196] usb usb3: Manufacturer: Linux 4.9.35-rt25mah ehci_hcd
[ 5.663197] usb usb3: SerialNumber: 0000:00:1d.0
[ 5.942947] usb 1-6: new high-speed USB device number 2 using xhci_hcd
[ 5.990948] usb 3-1: new high-speed USB device number 2 using ehci-pci
[ 6.084231] usb 1-6: New USB device found, idVendor=05e3, idProduct=0608
[ 6.084234] usb 1-6: New USB device strings: Mfr=0, Product=1, SerialNumber=0
[ 6.084235] usb 1-6: Product: USB2.0 Hub
[ 6.139243] usb 3-1: New USB device found, idVendor=8087, idProduct=8000
[ 6.139246] usb 3-1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
[ 8.454991] usb 2-3: new SuperSpeed USB device number 2 using xhci_hcd
[ 8.475891] usb 2-3: LPM exit latency is zeroed, disabling LPM.
[ 8.476631] usb 2-3: New USB device found, idVendor=199e, idProduct=9089
[ 8.476634] usb 2-3: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 8.476635] usb 2-3: Product: DFK 33UX264
[ 8.476637] usb 2-3: Manufacturer: The Imaging Source Europe GmbH
[ 8.476638] usb 2-3: SerialNumber: 08714211
[ 8.525616] usbcore: registered new interface driver uvcvideo
user1@viewer1 ~ $ dmesg | grep xhci
[ 5.601643] xhci_hcd 0000:00:14.0: xHCI Host Controller
[ 5.601653] xhci_hcd 0000:00:14.0: new USB bus registered, assigned bus number 1
[ 5.602738] xhci_hcd 0000:00:14.0: hcc params 0x200077c1 hci version 0x100 quirks 0x0004b810
[ 5.602743] xhci_hcd 0000:00:14.0: cache line size of 64 is not supported
[ 5.602950] usb usb1: Manufacturer: Linux 4.9.35-rt25mah xhci-hcd
[ 5.621541] xhci_hcd 0000:00:14.0: xHCI Host Controller
[ 5.621548] xhci_hcd 0000:00:14.0: new USB bus registered, assigned bus number 2
[ 5.621619] usb usb2: Manufacturer: Linux 4.9.35-rt25mah xhci-hcd
[ 5.942947] usb 1-6: new high-speed USB device number 2 using xhci_hcd
[ 8.454991] usb 2-3: new SuperSpeed USB device number 2 using xhci_hcd
This was pretty good troubleshooting advice; it allowed me to narrow some things down and verify a few things. Hopefully these screenshots aren't too outrageously large. I took them with my desktop PC controlling the machine PC over VNC. VNC obviously adds some overhead and ties up the other Ethernet card as well.
This shot is with my GUI config with the camera not displaying anything. From what you've said, this should be plenty acceptable?
This shot is with an Axis config running and the camera being displayed by the separate qv4l2 app, which is also likely acceptable.
And this is the shot that's likely to be a problem. The image processing within my program is causing the tmax to run much higher.
The thing is that I don't have any noticeable missed steps. Positional accuracy on this thing is important, but a missed step is not something that's going to cause out-of-tolerance parts. It basically runs either from end to end repeatedly or cycles through stored positions over and over, so accumulated errors could be an issue, though I haven't seen them yet. Right now it's using a direct-drive stepper on a 2mm belt, looking at 40mm/rev or a step scale of 508 and a speed of 1-3 ips (0.5-1.5 kHz?). I'm likely to change to a 5:1 geared stepper and 3mm belt to smooth it out a bit. I'm doing that to increase the step rate and hopefully eliminate the slow-speed vibrations. That's still a relatively low step rate, and I suppose a missed step in that case is even less critical since the output angle goes from 1.8°/step to 0.36°/step.
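The step-rate estimate above can be reproduced with a few lines of Python. This is only a sketch: the 200 full steps per revolution and 4x microstepping used to arrive at a scale of 508 are assumptions (the post gives only the 40mm/rev travel and the resulting scale), and the function names are made up for illustration.

```python
# Sketch: step scale, step rate and output step angle for the belt axis.
# Assumptions (not stated in the post): 200 full steps/rev motor, 4x microstepping.

MM_PER_INCH = 25.4

def steps_per_inch(full_steps_per_rev, microsteps, travel_mm_per_rev):
    """Step scale in steps per inch of carriage travel."""
    return full_steps_per_rev * microsteps * MM_PER_INCH / travel_mm_per_rev

def step_rate_hz(step_scale, velocity_ips):
    """Step frequency needed to move at velocity_ips (inches per second)."""
    return step_scale * velocity_ips

def output_step_angle(motor_step_deg, gear_ratio):
    """Angle at the gearbox output per motor step."""
    return motor_step_deg / gear_ratio

scale = steps_per_inch(200, 4, 40.0)
print(f"step scale: {scale:.0f} steps/inch")
for v in (1.0, 3.0):
    print(f"{v:.0f} ips -> {step_rate_hz(scale, v):.0f} Hz")
print(f"5:1 gearbox: {output_step_angle(1.8, 5):.2f} deg/step")
```

Under those assumptions the axis needs roughly 0.5 to 1.5 kHz, which is far below the 10 MHz the 7i76e step generator is quoted as supporting earlier in the thread.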
Would you say it's likely that the lockups and crashes are due to resource usage in the python code rather than LinuxCNC being unhappy about something?
02 Oct 2017 00:57 #99757
by PCW
Replied by PCW on topic Mesa 7I76e and step rates and latency
I would not expect lost steps from timing violations unless they were really long (multiple milliseconds)
and caused a step motor stall due to a larger velocity step than the motors can follow.
If you do not have high accelerations you might try slowing the servo thread to, say, 500 Hz (2 ms)
and see if that eliminates the latency warning issues.
My guess is that any hangs/crashes are more likely due to bugs than resource exhaustion.
Have you run "top" when the system is running with video?
Your Ethernet MACs are Intel PCIe chips (they use the igb driver), so they are fine for real time.
02 Oct 2017 04:27 #99761
by blazini36
Replied by blazini36 on topic Mesa 7I76e and step rates and latency
I think I figured it out and sent my findings to the Python guy. There is a bug in the program where a particular function can't tolerate a limit switch error and locks the GUI up. There were missed steps, possibly mechanical, so the next commanded move sent the carriage into the limit prox, and LinuxCNC did not print the limit switch error. I reduced the pulley size to increase torque and reduce the distance/rev, which should reduce the mechanical misstepping.
I was having trouble finding info on setting acceleration. I'm set like this:
MAX_VELOCITY = 4
MAX_ACCELERATION = 30
STEPGEN_MAXVEL = 5
STEPGEN_MAXACCEL = 37
I basically set the acceleration so I have good deceleration. There are a lot of starts, stops, and direction changes on this thing. Would this be considered "high"?
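One rough way to put those numbers in context is to compare them with the 1 G (~400 IPS/S) figure quoted earlier in the thread as "high for most machines". The Python sketch below assumes the INI values are in inches and seconds, which the post does not state explicitly; the helper names are made up for illustration.

```python
# Sketch: compare the INI acceleration with 1 G and check stepgen headroom.
# Assumption: machine units are inches, so accelerations are in in/s^2.

G_IN_PER_S2 = 386.09  # standard gravity expressed in inches per second squared

def accel_in_g(accel_ips2):
    """Acceleration as a fraction of 1 G."""
    return accel_ips2 / G_IN_PER_S2

def headroom(stepgen_limit, axis_limit):
    """How far the stepgen limit sits above the axis limit, as a ratio."""
    return stepgen_limit / axis_limit

MAX_VELOCITY, MAX_ACCELERATION = 4.0, 30.0
STEPGEN_MAXVEL, STEPGEN_MAXACCEL = 5.0, 37.0

print(f"acceleration: {accel_in_g(MAX_ACCELERATION):.3f} G")
print(f"velocity headroom: {headroom(STEPGEN_MAXVEL, MAX_VELOCITY):.2f}x")
print(f"accel headroom: {headroom(STEPGEN_MAXACCEL, MAX_ACCELERATION):.2f}x")
print(f"time to full speed: {MAX_VELOCITY / MAX_ACCELERATION * 1000:.0f} ms")
```

Under that unit assumption, 30 in/s^2 is a bit under a tenth of 1 G, and both stepgen limits sit roughly 25% above the axis limits.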
I looked at the hardware monitor and the camera display is maxing out both CPU cores while running in the config. It's close to 100% on both cores, much higher than with qv4l2. I was worried about the integrated GPU not having the guts for the camera, but it turns out when running "intel_gpu_top" that it's not offloading to the GPU at all. qv4l2 increases GPU usage by 35%, which probably explains its lower CPU usage.
I assumed the Ethernet cards were on the USB3 bus because when I first tried to set this thing up I used the LinuxCNC Debian Wheezy ISO. I forget what the issue was but I had to disable XHCI (USB3) to get it to boot. That made my Ethernet cards not work, so I couldn't even update the PC to try to fix it. Then I tried to use the LMDE2 ISO that's floating around this forum but my camera wouldn't work. After pulling my hair out I realized the kernel was not compiled with the USB video module enabled. So I wound up compiling a newer kernel myself for LMDE2.
03 Oct 2017 01:28 #99796
by blazini36
Replied by blazini36 on topic Mesa 7I76e and step rates and latency
Due to the beauty of Linux I was able to actually move the machine PC's HDD to my desktop and run the Ethernet cable into the 7i76e. With minimal changes I was able to boot the entire setup. LMDE2 was running in software rendering mode, so that was eating up some CPU, and I still saw 1/3 the servo-thread latency.
Now I saw "motion.servo.last-period" at nearly 5000000 while servo-thread.tmax was around 1000000-1500000 iirc. So I'm assuming the higher the "motion.servo.last-period" the better?
It seems we can't change much about the camera usage, and even though I may be able to work around the latency there are camera performance issues. So I think I'll have to retire this little PC until I build a CNC mill or something. Latency seems pretty decent with just LinuxCNC and some normal tasks.
03 Oct 2017 01:55 #99800
by PCW
Replied by PCW on topic Mesa 7I76e and step rates and latency
Both times are in CPU clocks, so it's the ratio that's important
(divide both by the CPU clock frequency to get times in seconds), so
motion.servo.last-period should be about CPU clock/1000
with a 1 ms servo thread.
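As a rough illustration of that ratio check, the Python sketch below uses the approximate readings reported in the previous post (last-period near 5000000, tmax around 1000000-1500000) together with the "no more than about 1/2" guideline from earlier in the thread. The function names and the 3 GHz clock in the last line are hypothetical, chosen only to show the conversion to seconds.

```python
# Sketch: judge servo-thread.tmax against motion.servo.last-period.
# Both values are in CPU clocks, so only their ratio matters.

GUIDELINE = 0.5  # tmax should ideally stay below about half of last-period

def servo_load(tmax_clocks, last_period_clocks):
    """Fraction of the servo period used by the worst-case thread run."""
    return tmax_clocks / last_period_clocks

def within_guideline(tmax_clocks, last_period_clocks, limit=GUIDELINE):
    return servo_load(tmax_clocks, last_period_clocks) <= limit

def clocks_to_seconds(clocks, cpu_hz):
    """Convert a clock count to seconds for a given CPU clock frequency."""
    return clocks / cpu_hz

last_period = 5_000_000
for tmax in (1_000_000, 1_500_000):
    load = servo_load(tmax, last_period)
    print(f"tmax={tmax}: {load:.0%} of period, ok={within_guideline(tmax, last_period)}")

# Hypothetical 3 GHz CPU, only to illustrate the unit conversion
print(f"{clocks_to_seconds(1_500_000, 3.0e9) * 1e6:.0f} usec")
```

With those readings the worst case uses roughly 20-30% of the period, which is inside the guideline; a higher last-period on its own says nothing unless tmax is compared against it.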