Cannot seem to get good latency numbers!

15 Aug 2022 19:19 #249809 by Sray69
I was really looking forward to using LinuxCNC but it seems too difficult to find a computer setup that will work well with it. For the life of me I cannot get good latency numbers no matter what I do. I can't continue to waste so much time trying to get different computers to work with it and I especially cannot afford to be buying computers/components in hopes they will work, only to find out they do not. I am surprised in this day and age just how challenging it is to determine which computer/components will work well with LinuxCNC.

Don't get me wrong, I am not saying anything bad about this software. I am sure it is amazing, and that is why I wanted to use it, but if you are not lucky enough to have the perfect combination of computer components, it can be very frustrating, time-consuming, and possibly expensive to get a system that works well with LinuxCNC.

I have read through all the wiki material and the compatible-computer lists, and yet I am no closer to an easy solution. I would love it if someone could tell me a specific computer model that WILL perform well, and that I can find easily at a reasonable cost.

My latest computer test:

MB: ASUS M5A78L-M/USB3
CPU: AMD FX-6300 Six-core Processor 3.5 GHz
RAM: 8GB DDR3-1600 (800MHz clock) PC3-12800
Controller card: Mesa 7i76e

I turned off everything (except ACPI) in the BIOS that I could, and these are my latency numbers:
Max Jitter:
Servo: 2756040
Base: 163095

If I turn off ACPI V2 but leave ACPI APIC enabled I get
Max Jitter:
Servo: 1031871
Base: 90540
The problem with this is that things are a little slow/unstable, like dragging windows around, webpages loading, etc.

If I turn off all ACPI, or turn on ACPI V2 and turn off APIC, then the computer lags badly until it crashes.
The reason I mention ACPI is that I see the best numbers by disabling it, but the computer becomes unstable.
All these tests were run with two browsers open and multiple videos playing and pages open. I would bounce around and scroll and drag and reload things.

I also noticed that by switching the monitor from DVI to D-sub the latency numbers looked much better, but the computer seemed a little slow and unstable. Graphics performance was definitely better with DVI.

Any help would be appreciated.



 

15 Aug 2022 19:45 #249814 by PCW
You might try running the test without a base thread.

The default base thread of 25 usec can easily swamp
systems that don't have great latency (and the base
thread will not be used with the 7I76E anyway).
For example:

latency-histogram --nobase --sbinsize 5000


Also are you sure you are running a real time kernel?

What are the results of the command:

uname -a

?

15 Aug 2022 22:22 #249828 by Sray69
Here are the results.
latency-histogram --nobase --sbinsize 5000
File Attachment:


Also are you sure you are running a real time kernel?
Note: Using POSIX realtime

What are the results of the command:

uname -a
Linux LinuxCNC 5.18.0-3-rt-amd64 #1 SMP PREEMPT_RT Debian 5.18.14-1 (2022-07-23) x86_64 GNU/Linux

15 Aug 2022 22:28 #249829 by Sray69
Tried to get the image to show in my post but cannot figure it out.

Anyway, I was running multiple browsers with multiple videos playing, glxgears, and several open webpages. The histogram reflects running all of this for a couple of hours.

15 Aug 2022 22:51 #249833 by PCW
That should be fine.

15 Aug 2022 23:32 #249844 by Sray69
Will there be any bottlenecks with this setup? I tried to purchase components for my CNC that would perform fairly well. Since I am not sure how to read any of these results, I just want to make sure that LinuxCNC will not hold the rest of the machine back from performing optimally.

16 Aug 2022 00:02 #249845 by PCW
Latency with Mesa hardware does not have much effect on performance.
As long as you can run a 1 KHz or so servo thread without timeouts
it should be fine.
The following user(s) said Thank You: Sray69

16 Aug 2022 04:16 #249850 by tatel
I have seen quite a few posts from you, and there are some things I've noticed:

1- Kernel boot parameters

Even if your BIOS/UEFI doesn't give you many options, you can try some kernel boot parameters to disable, say, hyperthreading and C-states. Those boot parameters have to be written into /etc/default/grub, on the GRUB_CMDLINE_LINUX_DEFAULT= line.

After saving changes, you must run update-grub and reboot the machine.

"nosmt" will disable hyperthreading even if your BIOS doesn't offer that option to you.

"intel_idle.max_cstate=0" will disable intel_idle kernel module. This kernel module deals with C-states. You'll probably need to use also "processor.max_cstate=0" to effectively disable C-states now that intel_idle kernel module isn't working anymore.

"processor.max_cstate=0" will disable any c-states on non-intel machines. No need to worry about intel_idle module on non-intel systems

"idle=poll" can be tested after "intel_idle.max_cstate=0" and "processor.max_cstate=0" have been proven not to work for you. This parameter makes the processor unable to get into any c-state; it will not be idling but polling instead. Processor will run hotter, however. So, use it as last resource.

2- Now, some other boot parameters that have nothing to do with CPU idling but are, or could be, useful. I've seen that you are not using the isolcpus parameter.

isolcpus=a,b,c,...,z will isolate cores a,b,c,...,z, preventing the Linux scheduler from sending any tasks to them. Well, almost. So, on a dual-core machine you would typically use "isolcpus=1", and on a quad-core machine you could use "isolcpus=2,3" or even "isolcpus=1,2,3". LinuxCNC and the latency tests know about the isolcpus= parameter, as you can see in the latency-histogram window, so you can usually start any latency test or LinuxCNC in the usual way even while using this parameter. I know the newest advice is not to use it, but in my experience it works like a charm. Your latency could drop by orders of magnitude, and it could even be the only way to get a usable system. Why not use it?
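As a small sketch (assuming GNU coreutils: nproc, seq, paste), you can generate the "isolate every core except core 0" list from the core count:

```shell
#!/bin/sh
# Sketch: build an isolcpus= value naming every core except core 0.
# Assumes GNU coreutils (nproc, seq, paste).
cores=$(nproc)
if [ "$cores" -gt 1 ]; then
    # e.g. on a quad-core machine this prints: isolcpus=1,2,3
    list=$(seq 1 $((cores - 1)) | paste -sd, -)
    echo "isolcpus=$list"
fi
```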

"acpi_irq_nobalance" instructs acpi not to balance the system load by changing the cores where IRQs are addressed to. It may or may not work for you, but you can give it a try.

"nowatchdog" will disable both hard and soft kernel watchdogs, which will be triggered if you are overloading your system while testing latency. So, if your system load is high while testing, you'll get probably worse latency because a) system is overloaded in the first place, and b) if the watchdog detects there are processes waiting too much, it will throw a warning, that by itself will make latency perhaps some tens of microseconds worse. This is not the reason you are getting so bad latency numbers, however. But I find useful to have this parameter in force while testing. and usually forget about deleting it while in production. No harm here, it seems to me.

You can read about 4.19 kernel parameters here:

www.kernel.org/doc/html/v4.19/admin-guid...rnel-parameters.html

Please note that both the RT and RTAI kernels may disable many things at build time, so things like the cpufreq governor, P-states, et al. could have no viable application. You can read about, say, the nohz_full parameter, but before testing you should check whether those things were disabled in the kernel at build time. The kernel configuration is under the /boot directory. So, if you want to know whether your running kernel has, say, no_hz capabilities, you could run:

grep -i no_hz /boot/config-`uname -r`

On Debian Wheezy with linuxcnc 2.7 it will return:

CONFIG_RCU_FAST_NO_HZ=y
CONFIG_NO_HZ=y

The "y" meaning that capability is built into the kernel and used by default
If that command doesn't give you any answer, it means that the parameter/capability doesn't exist as you named/typed it. You could also get an answer" XXXX is not set", meaning it's disabled

By the way, the nohz parameter is on by default. It's about local timer interrupts. It could be that, by disabling those interrupts, a core ends up idling, and getting that core up again costs some microseconds, making latency worse. So you can give "nohz=off" a try, to see whether it makes your latency worse or better.
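Extending the grep idea above, a short loop can report several build-time capabilities at once. The option names and the ability to override the config path are just illustrative:

```shell
#!/bin/sh
# Sketch: check several build-time capabilities of a kernel config.
# CONFIG_FILE defaults to the running kernel's config under /boot;
# the option names below are only examples.
CONFIG_FILE="${CONFIG_FILE:-/boot/config-$(uname -r)}"
for opt in NO_HZ HIGH_RES_TIMERS PREEMPT_RT; do
    grep "^CONFIG_${opt}" "$CONFIG_FILE" \
        || echo "CONFIG_${opt}: not found in $CONFIG_FILE"
done
```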

3- smp_affinity

smp_affinity decides which CPU/core your tasks are addressed to. If you are isolating a core, you don't want any tasks other than your realtime task sent to that core. It can't be fully done (yet), but you want as few processes/threads as possible going to that core. Also, generally speaking, you don't want that core dealing with any system interrupts.

isolcpus= may not be enough to get an isolated core; I have actually seen IRQs sent to supposedly isolated cores. I don't know exactly how that happens. I mean, I didn't send anything to them and, since isolcpus was in force, the kernel shouldn't have either. But there they are, and you want them out of the isolated core where your realtime task is to run.

You could run this command in a terminal window while testing:

watch -n 1 -d cat /proc/interrupts

This could be the output:

            CPU0        CPU1
  0:          39           0   IO-APIC-edge    timer
  1:          68           9   IO-APIC-edge    i8042
  7:           1           0   IO-APIC-edge    parport0
  8:           0           1   IO-APIC-edge    rtc0
  9:           0           0   IO-APIC-fasteoi acpi
 12:       14041         135   IO-APIC-edge    i8042
 16:           0         422   IO-APIC-fasteoi ohci_hcd:usb2, pata_atiixp, snd_hda_intel
 17:   219362733         304   IO-APIC-fasteoi ohci_hcd:usb3, ohci_hcd:usb5, radeon
 18:           0           0   IO-APIC-fasteoi ohci_hcd:usb4, ohci_hcd:usb6
 19:      117464          38   IO-APIC-fasteoi ehci_hcd:usb1, eth0
 21:          56          11   IO-APIC-fasteoi
 22:      200833        3229   IO-APIC-fasteoi ahci
NMI:           0           0   Non-maskable interrupts
LOC:    32572366       31318   Local timer interrupts
SPU:           0           0   Spurious interrupts
PMI:           0           0   Performance monitoring interrupts
IWI:           0           0   IRQ work interrupts
RTR:           0           0   APIC ICR read retries
RES:        2173          18   Rescheduling interrupts
CAL:           0         683   Function call interrupts
TLB:           0           0   TLB shootdowns
TRM:           0           0   Thermal event interrupts
THR:           0           0   Threshold APIC interrupts
MCE:           0           0   Machine check exceptions
MCP:         261         261   Machine check polls
ERR:           1
MIS:           0

As you can see, you now have a detailed, updating list of where your interrupts are going. So, IRQ 1 is your keyboard, and if you press any keys you should see the number change under CPU 0 or CPU 1, or whatever it is in your case. In this example, you can see that the radeon graphics card on IRQ 17 has sent 219362733 interrupts to CPU 0, but just 304 to CPU 1 (and that was while booting).

Of course, if you are using isolcpus=1, you probably don't want any system IRQ going to CPU 1, but to CPU 0, where it doesn't disturb the realtime task. The output of this command will show you what's happening on your CPU's cores and will give you clues about what could be causing your bad latency.
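For reference, /proc/irq/N/smp_affinity takes a hex bitmask (CPU 0 -> 1, CPU 1 -> 2, CPU 2 -> 4, and so on). A sketch of computing such a mask, with IRQ 17 used purely as an example number taken from the listing above:

```shell
#!/bin/sh
# Sketch: turn a CPU number into the hex bitmask that
# /proc/irq/N/smp_affinity expects.
cpu=0
mask=$(printf '%x' $((1 << cpu)))
echo "affinity mask for CPU $cpu: $mask"
# On the target machine, as root, you would then do something like:
# echo "$mask" > /proc/irq/17/smp_affinity
```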

Of course, if you use isolcpus=, commands like top or htop will show the isolated core at 0% load. This is because neither top nor htop shows kernel threads unless instructed to do so. But if you do instruct them to, you'll see the threads created by your latency test or LinuxCNC on the isolated core. Not that the realtime thread is going to overload that core, however. Far from it.

Now, as I understand it, the trend against isolating cores is about not wasting those cores' processing power. But, hey, what I care about is latency. So, if I can get better latency by isolating a core, sign me up.

Actually, isolcpus is deprecated by the Linux kernel developers. The advice is to use cpuset capabilities. But isolcpus still works, is easier, especially for newbies, and I'm willing to keep using it as long as it remains useful. YMMV.

You can have your smp_affinity set to 1 automatically at the end of the boot process by putting this in your /etc/rc.local file. However, it may not work for you, so it's probably better to use it as a script launched manually while testing; then, if having smp_affinity set to 1 proves useful, you can put it in /etc/rc.local to have it in force after booting the system.

###################################################
#!/bin/sh -e
#
# rc.local
#
# This script is executed at the end of each multiuser runlevel.
# Make sure that the script will "exit 0" on success or any other
# value on error.
#
# In order to enable or disable this script just change the execution
# bits.
#
# By default this script does nothing.

# Route every IRQ (and the default mask for new IRQs) to CPU 0.
for i in /proc/irq/*; do
    if [ -d "$i" ]; then
        /bin/echo 1 > "$i/smp_affinity" || true
    else
        /bin/echo 1 > "$i" || true
    fi
done

exit 0
############################################

This can be used on Debian Wheezy with no problems. It also works on Buster, where systemd is there to mess things up if they are not to its liking. So on Buster, after putting it in your /etc/rc.local, you need to do:

systemctl daemon-reload
systemctl start rc-local # yes rc-local not rc.local. Ask systemd about it

After that, it should always be in force after booting the system. You can check it with:

systemctl status rc-local

4- Your graphics card.

It's usually said that integrated graphics are bad; however, I could give some examples where this doesn't hold true. What remains true for me, however, is: nvidia graphics are bad. Maybe this will change in the future, now that a free driver has appeared. We will see. Anyway, you could get a used, 30-buck radeon card and do some testing. I always test latency with and without a discrete radeon card.

5- Latency-histogram and variance errors

This happens to me with the HP-Compaq dc5750 and dc5850. When it happens, I can see a call trace in the dmesg output about MSI messaging and smp_affinity. Sometimes latency-histogram works fine; sometimes I get those pesky variance error messages. You should check your dmesg output for traces.

6- Machines with known good latency.

For me, a couple of dual-core, Intel-based, MEDION-branded machines gave good results, in the 15-20 usec range.

Recently I got a bunch of HP-Compaq machines, dc5750 MY and dc5850 MT, with radeon integrated graphics, that give me <3 usec latency with 2.7 on Wheezy, about 10 usecs with 2.8-RTAI (launching with "taskset 2 latency-histogram" / "taskset 2 linuxcnc" to work around the MSI quirk already mentioned), and about 60 usecs with 2.8-RT (using isolcpus=1). Those are Athlon X2 dual-core business systems and can be found dirt cheap in used but good condition.

And last year I got a quad-core Dell Optiplex 960 from eBay for about 100 bucks, giving about 10-15 usec latency using isolcpus=1,2,3. These Optiplexes are also business systems, and AFAIK there are quite a bunch of them on eBay.

Systems that I have heard of recently but not tested personally:

Dell Optiplex 980, which is said to give about 40 usecs latency without any kernel parameters whatsoever, just disabling things in the BIOS; judging by the kernel mentioned, I guess it was using 2.8-RT. You can see it in this thread which, BTW, is about variance errors in latency-histogram output:

forum.linuxcnc.org/38-general-linuxcnc-q...ils-on-latency-tests

Celeron J4125, about 150-200 bucks (new) on ebay/amazon. See:

forum.linuxcnc.org/18-computer/44740-celeron-j4125

7- No base thread needed

As already posted, if you are going to use a Mesa/Pico card, you don't need the base thread and should run your tests with something like this:

latency-test 1000000 -

or

latency-histogram --nobase --sbinsize 1000

8- Stick to the stable release until you gain experience with LinuxCNC, or else you'll probably get frustrated. Leave the new, experimental stuff to the developers. So far you are just dealing with Linux latency. You thought this would be an easy ride? Welcome to the club. Please remember that dealing with HAL is another can of worms. LinuxCNC is very powerful, flexible software and has a learning curve. I'm glad I got a Sherline mill that has a working LinuxCNC config out of the box, or else I would probably have committed seppuku some years ago. Even though I was used to building rt-preempt kernels back when it was just an out-of-mainline set of patches, a couple of decades ago.

Best wishes
The following user(s) said Thank You: macsddau@linuxcnc, mdm55, Sray69, Stanislavz

16 Aug 2022 04:30 #249851 by Sray69
WOW! Thanks tatel for this great info. I am gonna need to find some time to go through it all and run some tests. I can already tell this is exactly what I was wanting to know. I will post my testing results.

THANKS!

16 Aug 2022 06:37 #249856 by Sray69
I tried an old Radeon video card I had but the fan was locked up and the screen was all messed up. I found this card online for $14. Would it work well?

Dell AMD Radeon R5 240 1GB DDR3 DVI/ D-Port Video Card F9P1R 0F9P1R

Not sure exactly what to look for.

Powered by Kunena Forum