Successfully configured with stepconf: but +Y-axis "stutters" running G-code

03 Apr 2022 19:26 - 03 Apr 2022 20:40 #239206 by clunc
Update: standalone DVI-D Radeon card has arrived and been installed; if anything latency seems worse (+130k ns base jitter)

Answering the last question first:
Machine: Dell Optiplex 780, with 10 GB RAM[1] (hwinfo)
CPU: model name    : Intel(R) Core(TM)2 Duo CPU     E7600  @ 3.06GHz (hwinfo)
Kernel: Linux aquino 5.4.177-rt69 #2 SMP PREEMPT_RT Sun Feb 20 21:43:22 CST 2022 x86_64 x86_64 x86_64 GNU/Linux (uname)

[1] hwinfo did turn up one curiosity that may matter; this is what it reports for the memory array:
Physical Memory Array: #4096
    Location: 0x03 (Motherboard)
    Slots: 4
    Max. Size: 8 GB

However, there are actually (2) 4 GB and (1) 2 GB RAM modules installed, in a 4-2-4 (DIMM-1-DIMM-2-DIMM-3) configuration for a total of 10GB. I could try reducing that to 2 GB and placing the module in the DIMM-1 slot to see if there's a lot of overhead in the memory access.

Nebur,

Firstly, I'm terribly sorry for the drawn-out delay in replying. I started to reply immediately, but then had to get up to go see about the hardware and kernel information you asked about when... Other Things started breaking, including a basement window which fell out in high winds because my teen-aged boys put it in five years ago and secured it with construction wedges. Car troubles too, and unscheduled trips, and, and... It went from too-bad to ridiculous.
And secondly, I want to thank you for your offer to help.

Nebur wrote: "Did you optimize the BIOS settings?
In a nutshell: Deactivate every power management feature you can find and disable every feature and onboard device you don't need. Stuff like SpeedStep etc.

"I spent some time months ago playing with preempt-rt, different machines, VGA, core isolation, interrupt routing... yada yada to squeeze 'more rt' out of machines. There is quite a lot of stuff that can be done... see attachment for a nice result while torture testing.

"But a 'clean' BIOS is a good initial step. If suitable BIOS options to get power / clock management under control aren't available, kernel params can be used to get things under control. The kernel documentation contains all the info required."
 
I turned off everything I could see: effectively, telling the BIOS "No. Don't help."

Nebur wrote: "Isolcpus isn't worthless btw. It won't turn a lemon into something brilliant, but depending on how intertwined the cores are (e.g. the cache layout can require sacrificing more than one core), it can be combined with proper IRQ routing to keep a core 'clean' and reduce an 'acceptable' latency by an order of magnitude."
And that's a question I have: Is the Dell Optiplex I've jumped to a lemon as far as a software-stepper for LinuxCNC is concerned?
As I recall leaving it, I have set isolcpus to first one then the other core and noticed no difference or improvement.

Nebur wrote: "Also my experience on old Intel machines has been that onboard VGA produced the least noise in terms of latency artifacts."
...which suggests that I probably should be looking elsewhere than video for improvement.

Nebur wrote: "What CPU and kernel version are you running exactly? I might have dealt with a similar machine and have more concrete tips or 'kernel spells' I could share :-)"
(Info moved to top)

 

Last edit: 03 Apr 2022 20:40 by clunc. Reason: correct

03 Apr 2022 20:33 #239212 by tommylight
TL;DR
Get the 2 GB stick out, put the 2 remaining 4 GB sticks into the slots with BLACK ears/levers, not the white ones.
Test again.
Additional info:
DDR = Double Data Rate RAM (random-access memory); dual channel is a feature of the memory controller that these boards support.
Meaning when 2 or 4 sticks with the same capacity and speed and latency and ... are used, the processor can access both channels at the same time. But whenever there is a "not the same" memory stick, it can hurt the memory performance.

03 Apr 2022 20:40 #239214 by clunc
The following user(s) said Thank You: tommylight

03 Apr 2022 23:44 - 04 Apr 2022 00:02 #239217 by Nebur
As Tommy said - ditch the odd DIMM and stay on two sticks.
Additionally ditch the discrete GPU and use the internal VGA.

Regarding BIOS settings, have a look at the attached PDF. I highlighted the settings I would recommend.

Kernel params that can help (see www.kernel.org/doc/html/v5.4/admin-guide/kernel-parameters.html for details):
isolcpus=1 irqaffinity=0 processor.max_cstate=0

After rebooting enter the terminal and run
watch -d -n1 cat /proc/interrupts
You should see the interrupt count mostly increase on CPU0. CPU1 should see only rescheduling and timer ints.
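
For anyone who would rather script that check than eyeball `watch`, here is a minimal sketch in Python. It assumes the usual /proc/interrupts layout (a header row of CPU names, then one row per interrupt with a count per CPU). The function names and the `allowed` set (LOC for local timer, RES for rescheduling) are illustrative assumptions, not part of any LinuxCNC tool; other per-CPU rows may also tick on an isolated core on some kernels.

```python
import time

def parse_interrupts(text):
    """Parse /proc/interrupts text into {label: [count per CPU]}."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    ncpu = len(lines[0].split())  # header row looks like "CPU0 CPU1"
    table = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        counts = rest.split()[:ncpu]
        # Summary rows such as ERR and MIS hold one total, not per-CPU counts.
        if len(counts) == ncpu and all(c.isdigit() for c in counts):
            table[label.strip()] = [int(c) for c in counts]
    return table

def noisy_irqs(before, after, cpu=1, allowed=("LOC", "RES")):
    """Labels whose count on `cpu` grew between two snapshots, minus `allowed`."""
    grew = []
    for label, counts in after.items():
        if label in allowed or label not in before:
            continue
        if counts[cpu] > before[label][cpu]:
            grew.append(label)
    return sorted(grew)

def sample_live(seconds=1.0, path="/proc/interrupts"):
    """Take two snapshots `seconds` apart on a running Linux system."""
    with open(path) as f:
        before = parse_interrupts(f.read())
    time.sleep(seconds)
    with open(path) as f:
        after = parse_interrupts(f.read())
    return before, after
```

On the machine itself, `noisy_irqs(*sample_live())` lists the interrupt labels that still fired on CPU1 during the sampling window; an empty list is what the kernel parameters above are aiming for.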

Generally Core 2 machines are pretty good, but in this case I suspect a 'lemon'. Those huge latency glitches you have are most likely some weird Dell SMIs that you possibly won't be able to get rid of. But you might be lucky...
Edit:
I totally misread the latency histogram and jumbled the off-chart counts with time. ~30 microseconds isn't really that bad. Are we chasing the real issue for the stuttering axis?
Attachments:
Last edit: 04 Apr 2022 00:02 by Nebur. Reason: Correction
The following user(s) said Thank You: tommylight

03 Apr 2022 23:51 #239219 by tommylight
Are you aware that you can use a PC with pretty bad latency by raising the base period until you no longer get latency warnings?
Also, are you aware that you can install an RTAI kernel, which can have much better latency? The new RTAI might not work at all, though, so some experimenting with older RTAI versions might prove valuable.
I have some laptops with terrible latency that I use for testing with a base period of 300000 ns. This does limit the step rate a lot, though.
If you are using the official 2.8.2 ISO, give RTAI a try; it might surprise you:
linuxcnc.org/docs/2.8/html/getting-start...#cha:Installing-RTAI
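
To put rough numbers on how much a long base period limits the step rate, here is a small Python sketch. It assumes plain software stepping where each step takes two base periods (one for the pulse, one for the gap), which is the usual case without tricks such as the parallel port reset feature. The 2000 steps/inch scale in the usage note is purely hypothetical; substitute the machine's real value.

```python
def max_step_rate_hz(base_period_ns, periods_per_step=2):
    """Highest step frequency the base thread can generate, in steps/s."""
    return 1e9 / (base_period_ns * periods_per_step)

def max_feed_ipm(base_period_ns, steps_per_inch, periods_per_step=2):
    """Highest feed rate in inches per minute for a given axis scale."""
    rate = max_step_rate_hz(base_period_ns, periods_per_step)
    return rate / steps_per_inch * 60.0

def longest_base_period_ns(feed_ipm, steps_per_inch, periods_per_step=2):
    """Longest base period (ns) that still reaches the requested feed rate."""
    steps_per_second = feed_ipm / 60.0 * steps_per_inch
    return 1e9 / (steps_per_second * periods_per_step)
```

Under those assumptions a 300000 ns base period tops out at roughly 1667 steps/s, which at a hypothetical 2000 steps/inch is about 50 ipm; reaching 200 ipm at that scale would need a base period of about 75000 ns or shorter.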

04 Apr 2022 15:25 #239260 by clunc
Cut-to-the-chase: (2) 4GB sticks in complementary slots (-1 and -2); latency 123000 ns base. Somewhat better, but no blue ribbons.
I'm a little slow getting back to you because of an experiment.

I was concerned that the hwinfo report indicated the maximum memory for four slots was 8GB, and that the 4GB parts might be problematic so I tried the simplest configuration first: 2GB in DIMM-1.

Result: no boot.

Quick revert to: 4GB parts in DIMM-1 and DIMM-2 (that is, paired), and booted as before.

The latency-test seems to max out at about 123000 ns with 4 glxgears, 4 glxheads, and Firefox open with a dozen tabs and playing one YouTube video.

That's an improvement, but I'm going to shut down and go through the BIOS settings one by one using the guide Nebur provided.

I'll be back asap.

@tommylight, I suspected it, and hoped it might be possible, that the machine could be configured for "sub-optimal" performance by increasing the base period, but I hadn't ever seen it in writing.  For the kind of work I do (200 ipm tops, in hardwood), that might be workable for me.
The following user(s) said Thank You: tommylight

04 Apr 2022 16:28 #239263 by Nebur
120us jitter is worse than what you posted initially.
Assuming you still have that wonky radeon gpu installed, remove it and go onboard graphics.

Otherwise, any positive effect that BIOS settings or kernel params could provide will be completely invisible in the 'latency test'. The only observable change would be the shape of the 'latency histogram' (not the max +/- deviations).

Regarding mem config, dual-channel / paired dimms is the only way to go. I wouldn't bother experimenting with that. The only thing that improves latency on core2 is switching memory straps but that's in the realm of 'honing' some microseconds and getting a 'pointy' histogram curve / reducing standard deviation.

Still - I don't see how the initial ~30us latency you had could lead to a stuttering axis. Assuming the base thread timing is adequate for the given latency that shouldn't happen. Also why only in auto mode and not while jogging and why only on one axis?

05 Apr 2022 03:21 - 05 Apr 2022 03:22 #239299 by clunc
Update: latency-test is down to 33000 ns base. With (4) glxgears, (4) glxheads, youtube with 12 tabs and playing (4) youtube movies. 'latency-histogram --show' attached.  The histogram looks terrible again.

BIOS changes:
1. I went through the BIOS point by point again, following Nebur's checklist. I had previously left S.M.A.R.T. disk monitoring and Fan-Control Override both enabled; I have now disabled them.
2. Aside: I noticed that several options on Nebur's list were missing from my BIOS; "USB for Flex bay" and "HDD Acoustic Mode" being examples. It may signal a BIOS update is available.

Video change:
1. Removed the PCI-e DVI-D video card, and reverted to motherboard VGA.

Memory changes:
1. Removed the 2 GB stick and put the 4 GB pair in DIMM-3 and DIMM-4 (the blacks: following @tommylight's instruction).

Parallel port:
1. I had early-on disabled the motherboard parport in the BIOS and begun using one of the two parport PCI cards. The tests were run with this configuration.

The off-chart bin counts on the latency-histogram are still bad.
Attachments:
Last edit: 05 Apr 2022 03:22 by clunc.

05 Apr 2022 11:00 #239316 by Nebur
Not that I would necessarily expect it to make a difference, but 'fan control override' should be enabled, not disabled. That removes the burden from the hardware of monitoring temperature and controlling the fan.

05 Apr 2022 11:22 #239319 by clunc
I probably misunderstood. The fan runs on high all the time.
