Systematic approach to tracking down latency issue

06 Mar 2019 01:42 #127847 by tommylight

And btw: I stocked up on Core 2 Duo based Dell Optiplexes and Fujitsus when they were still available.
They have enough CPU horsepower for LinuxCNC with Axis and perform much better latency-wise than anything more modern I tried.

I did the same thing with Dell, but that stock is wearing thin lately!
Oh yeah, stay away from new Nvidia cards. New Radeon cards work fine... until you try to install the 3D drivers; then a reinstall is inevitable! I have an RX580 in the main computer that I use daily.

06 Mar 2019 19:12 #127906 by RichJordan
Thanks for your reply Hase.

I'm going to check memory usage and the SMART capability of the hard drive.

I use the Mesa FPGA boards (5i25 specifically) at work to run automation prototypes. No problems using those. I think I'll buy one. They run on a 1 ms servo thread, right? I left my machine with the latency profiler running over the course of about 9-10 hours and found a maximum latency spike of 2.5 ms. This could be a problem with the FPGA boards. Any thoughts about how they might behave with a latency spike like that?

Anyway, thanks again for all your thoughts.

Richard

07 Mar 2019 09:33 #127961 by hase
The servo thread is scheduled at 1 ms intervals, indeed.
This is the thread where LinuxCNC tracks the position of the tool (in the coordinate space) and calculates the next moves.
Basically, the output is the movement speed of the tool in space (and, derived from that, the speed of each joint) for the next interval.

If this is delayed by 2.5 periods (1 ms schedule and 2.5 ms lag in your case), that *will* result in problems: no matter how smart the servo thread is, it cannot compensate for this lag, I think. Not entirely sure.

To be honest: I use LinuxCNC only in my hobby shop and tend to be lazy about it. I have not analyzed the servo thread and how smart it is at handling lag, i.e. the difference between "I should have run at time x, but the current time is x + delta".
Still: a 2.5ms lag is hard or impossible to correct for.
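
The "should have run at time x, but current time is x + delta" idea can be put into numbers with a small sketch. This is not how LinuxCNC or its latency test measures anything; the function name `lateness` and the nanosecond timestamps are assumptions made purely for illustration.

```python
def lateness(scheduled_ns, actual_ns, period_ns):
    """Return (delta, whole_periods_late) for a single thread wake-up.

    delta is how late the thread actually ran compared to its schedule;
    whole_periods_late is how many complete thread periods fit into
    that delay. Hypothetical helper, not part of LinuxCNC.
    """
    delta = actual_ns - scheduled_ns
    whole_periods_late = max(delta, 0) // period_ns
    return delta, whole_periods_late


PERIOD_NS = 1_000_000  # 1 ms servo period, as discussed in this thread

# A wake-up that was due at t = 3 ms but happened at t = 5.5 ms
delta_ns, periods = lateness(3_000_000, 5_500_000, PERIOD_NS)
print(f"late by {delta_ns / 1e6} ms, {periods} whole period(s) overrun")
```

With the 2.5 ms spike from the post above, the delay covers two whole servo periods plus half of a third.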

When I had the problem with the SMART readout from the hard drive, the lags at the peak were even longer (about 10 times longer), but iirc the 2.5 ms could indicate a connection.

But back to the servo thread: the Mesa card runs on its internal crystal oscillator, not really on the 1 ms servo schedule.
But the servo thread runs on that schedule (see above), and it gives the commands (speed of each joint) to the card, which then generates step pulses corresponding to that desired speed.
And again: that speed will be the wrong one if the calculation stopped for too long a time (and 2.5 ms is too long :-)

My problem with the latency test in 2014 was this: it showed me a measured value and a peak.
But even on a system where the latency test showed no problems, LinuxCNC itself had one. Axis shows this as a popup.
This is easily missed, and when you click on it, it goes away and never appears again, because all future errors are ignored. This is a bit dangerous (but I support the design decision behind it, btw.).
And I have at least one system where LinuxCNC runs fine, but the latency test shows large latency peaks.

That btw. was my reason to start this thread back in the day: the latency test obviously uses a different method to measure the delay between calls to threads, or a different thread setup, or differs in some other way from an actual LinuxCNC thread setup.

So I followed all the advice in the wiki, like
- disabling CPU power save and sleep states (since I do not care about saving a watt there)
- binding interrupt handling to a core (well, I did that; no effect actually, but no harm either)
and in the BIOS settings I disabled everything I thought could influence RT performance: all unnecessary peripherals, virtualization, etc.
A really unsystematic, shotgun-type approach :-)
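
One small part of that checklist can be checked from a script rather than by hand. The sketch below only reads the Linux cpufreq governor of each core from sysfs; it assumes a kernel that exposes `scaling_governor` files, says nothing about sleep states (C-states) or BIOS settings, and the function names are made up for this example.

```python
from pathlib import Path


def read_governors(cpu_root="/sys/devices/system/cpu"):
    """Map each CPU directory name (e.g. 'cpu0') to its cpufreq governor.

    Returns an empty dict if the kernel exposes no cpufreq information.
    """
    governors = {}
    pattern = "cpu[0-9]*/cpufreq/scaling_governor"
    for gov_file in sorted(Path(cpu_root).glob(pattern)):
        cpu_name = gov_file.parent.parent.name
        governors[cpu_name] = gov_file.read_text().strip()
    return governors


def cpus_not_in_performance(governors):
    """List the CPUs whose governor is anything other than 'performance'."""
    return [cpu for cpu, gov in governors.items() if gov != "performance"]


if __name__ == "__main__":
    found = read_governors()
    for cpu, gov in found.items():
        print(f"{cpu}: {gov}")
    print("not in performance mode:", cpus_not_in_performance(found))
```

A core that is not listed under "performance" may still be scaling its clock, which is one of the things the power-save advice is meant to rule out.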

So, the short moral: you must test with an actual machine setup and see if Axis reports errors.
I ended up running the actual machine (with drives disabled to conserve power and spare my nerves the noise :-) for a day or so. Just the actual machine config and a repeating G-code sequence resembling actual parts (I just created a file with actual parts in it in CAM and edited it into an endless loop). I let that run as an air cut (no tool) and then disabled the drive power and let the controller run over the weekend.

And when the machine worked fine within my parameters for "fine", I invoked the cardinal rule of IT ("never change a running system") and let laziness win over the engineer's desire for a systematic approach to the problem :-)
The following user(s) said Thank You: RichJordan

07 Mar 2019 16:55 #128001 by Mike_Eitel
@PCW
I wonder what happens in reality in a Mesa stepper configuration when two or three servo cycles are lost to latency.
Are the corresponding steps lost, or does that produce something like an "input" jump with a corresponding ferror?
Mike

07 Mar 2019 21:08 #128031 by Todd Zuercher
It will react very similarly to how a servo would in the same situation. What happens depends on whether there was a velocity command change during the latency overrun and how large that change was. If the latency overrun is large enough, the watchdog will bite and shut down the system (more than 4 cycles will probably cause a shutdown). Basically, any commanded change in pulse rate will be delayed by the bad timing, and then the system will correct if any correction is needed. (The actual creation of pulses is closed loop, like on a real servo.)

07 Mar 2019 21:54 #128032 by Mike_Eitel
Yes, maybe...
But I have never read about any kind of regulator (position / speed) inside the Mesa card.
I guess it is similar to position regulation, meaning the PC reads the actual position, compares it against its internal model, and then sends the commanded position to the card.

That's why I would love to have PCW explain how it is implemented.
Mike

07 Mar 2019 21:56 #128033 by PCW
In addition to what Todd stated: if you are moving at a constant velocity, nothing would happen, except that the control loop in LinuxCNC will actually make a bogus correction, because it will have new position command data but stale position feedback data.
You can avoid this bogus correction with some HAL plumbing, so that an occasional dropped or timed-out update causes minimal disruption.

The actual (transient) error caused by a late velocity update is fairly small.
On machines with nominal acceleration, say a CNC machine with 1/4 G
acceleration and a 1 ms servo thread:

1/4 G ≈ 100 IPS/s, and 100 IPS/s = 0.1 IPS/ms
A velocity error of 0.1 IPS for 1 ms = 0.0001"
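
The same back-of-the-envelope estimate can be repeated for other accelerations and delays. The sketch below follows the rough model used above (the velocity error built up during the delay, held for the length of the delay); the function name is hypothetical, and it uses the exact value of standard gravity in inches rather than the rounded 100 IPS/s.

```python
G_IN_PER_S2 = 9.80665 / 0.0254  # standard gravity, about 386.09 in/s^2


def transient_position_error(accel_g, delay_s):
    """Rough estimate of the error caused by one late velocity update.

    accel_g: machine acceleration as a fraction of 1 G
    delay_s: how late the velocity update arrives, in seconds
    Returns (velocity_error in in/s, position_error in inches).
    """
    accel = accel_g * G_IN_PER_S2       # in/s^2
    velocity_error = accel * delay_s    # in/s accumulated during the delay
    position_error = velocity_error * delay_s  # inches, rough estimate
    return velocity_error, position_error


for delay_ms in (1.0, 2.5):
    v_err, p_err = transient_position_error(0.25, delay_ms / 1000.0)
    print(f"{delay_ms} ms late: {v_err:.3f} IPS, {p_err:.6f} in")
```

For a 1 ms delay at 1/4 G this comes out at roughly 0.1 IPS and roughly 0.0001", in line with the figures above; since the delay enters twice, the estimate grows with the square of the delay.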
The following user(s) said Thank You: Mike_Eitel, RichJordan

07 Mar 2019 22:14 #128034 by Mike_Eitel
THX
I think this clarifies once and for all that occasionally losing a few milliseconds to latency will not be noticed on normal machines.
I know why I like your products.

And to be even more clear:
I cannot understand why some people spend good money, for good reason, on good mechanics, and then try to save a few dollars by using cheap parallel port solutions. Especially when software "pulsing" can never be as smooth or precise as hardware generation.
m5c
Mike

08 Mar 2019 01:12 #128052 by tommylight

I cannot understand why some people spend good money, for good reason, on good mechanics, and then try to save a few dollars by using cheap parallel port solutions.

I have to agree with you on that.
The parallel port is... was a mighty device. I used it to copy files from one PC to another some 20-odd years ago, as it was much faster than any floppy at that time. And it never fails to work.
And I have a lot of Mesa boards, a lot!
It all depends on a lot of factors.

08 Mar 2019 06:48 #128060 by Mike_Eitel
Yes. And for light, small, or low-effort machines, or "just to have fun" etc., the LinuxCNC parport solution is a super solution. Good for both worlds. Big THX to the developers.
