Questions on Latency test script -v- CLI initiated with simulated loads

26 Jun 2021 20:17 #212979 by JackW327
This is my puzzle I could use some advice on ...
  • Using Tommy Light's script & after some tweaking I'm still getting max servo latencies from ~75us to ~100us on successive runs - Kinda Sucks
  • But if I just run # latency-histogram with 5 glxgears & the same videos in Firefox for 2-6 hours plus I'm not seeing max latencies above 30us - Not Bad
  • If I run # latency-plot with the same simulated load for 4-5 hours I only see peaks of < 20us within the first 20 - 30 minutes and then it's showing <10us consistently - Pretty Good

One of my earlier tests before some of the tweaks with typical results is here at this link

Note: I will post screen shots once I fix my X environment which I broke trying to install xfce4-screenshooter the wrong way ... sigh ...

I guess I have two questions / concerns at the moment ...

1. If possible I'd like to understand why the results above differ, rather than just doing a bunch of things from other posts (I'm still reading things and maybe that's the only way to get the understanding)
2. Given I will be using a Mesa 7i95, should I even care about the results above right now? Someday, yes, for sure. Right now, I want to get the mill going, clean off the dining room table, and make some parts for other projects.

Background so far:

The PC is a Dell Optiplex 790 (possibly a bad choice, yes). I have a Mesa 7i95. I'm going to have to build the current dev branch of Linuxcnc to get that going, I know. Before I invest more time into this PC I feel like I need to see the latency peaks stay below 100us consistently.
  • BIOS updated to Rev A22 which is the latest available for this machine from Dell
  • Turned off everything in the BIOS that looked like it was management related
  • Only the rear quad USB bank is on
  • The A22 BIOS added a check box to enable the audio, unchecked
  • Disabled all the drives except the HDD
  • The variable fan feature is still on (once off the D/R table and in the machine I can change this so if it's a problem I'll do it now)

The mill is a Series I 9 x 50 Bridgeport I converted back in 2005 to a Mach 3 rig. Servo motors. 3:1 reduction on Y & Z and 1.875:1 on X (I recently switched from a ~1/3hp NEMA 34 to a ~1/2hp NEMA 43 motor on X), 0.20" pitch ball screws, 500 line HEDS encoders, Gecko G320 servo drives. I have the Mesa encoder signal repeaters. I intend to configure LinuxCNC for closed loop operation.

Thinking out loud ... If I am doing the math right, a 100 inch/min rapid on my X would need ~31 steps per millisecond (about 3 steps every 100us). On a straight G0 X~ move a 100us delay is no big deal. I might see a glitch. On a G0 X~ Y~ Z~ with the tool engaged to the work I'm not at all sure how a 100us delay would show up. My assumption is that, to get the smoothest motion possible, steps for a G0 are sent round robin to the 7i95 registers with some kind of timing flag. Then the 7i95 is responsible for getting the signaling to the drivers synchronously. If so, a delay should not really foul up the move. But maybe it's way more complex than that ... I need to start reading code ...
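
A rough sketch of that step-rate arithmetic, using the X axis drivetrain numbers given above (500 line encoder, 1.875:1 reduction, 0.20" pitch screw). Two inputs are assumptions rather than facts from the post: that the encoder is read in quadrature (4 counts per line) with one step per count, and that the rapid is 100 inches per minute. The function and variable names are made up for illustration.

```python
def step_rate_per_second(encoder_lines, reduction, screw_pitch_in,
                         feed_in_per_min, quadrature=4):
    """Steps per second needed to hold a given feed rate.

    Assumes one step per quadrature encoder count (an assumption,
    not something stated in the post).
    """
    counts_per_motor_rev = encoder_lines * quadrature
    screw_revs_per_inch = 1.0 / screw_pitch_in
    # Motor turns 'reduction' times for each screw revolution.
    counts_per_inch = counts_per_motor_rev * reduction * screw_revs_per_inch
    return counts_per_inch * feed_in_per_min / 60.0


# X axis values from the post; the 100 in/min rapid is an assumed figure.
rate = step_rate_per_second(encoder_lines=500, reduction=1.875,
                            screw_pitch_in=0.20, feed_in_per_min=100.0)

# How many steps fall inside a 100 microsecond delay at that rate.
steps_in_delay = rate * 100e-6

print(f"{rate:.0f} steps/s, {steps_in_delay:.2f} steps per 100 us")
```

Under those assumptions the X axis needs on the order of 31,000 steps per second, so a 100us delay spans only about 3 steps.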

I made the grub changes to fake out the C state. This was after I saw a 10-20us improvement in latency tests by running an endless loop script in a shell at the same time, per the advice in one of the troubleshooting pages. But after more test runs it is not consistently helping.

What's left to try:
  1. IRQ Remapping - ("isolcpus=1") change here
  2. Trying to address System Management Interrupt (SMI) issues here
  3. Not sure what else? Maybe digging into seeing what's going on with the video drivers ... or going to another distro?

26 Jun 2021 21:28 #212984 by tommylight
The script is made by user "seuchato", all credits go to him for the effort put into this.
That is written at the top of the script page, so not my script.
Personally, i think you are overthinking this, wire the Mesa board, make a test config, run the PC with Mesa attached for a day or two, if it does not drop the link = forget latency and start making chips.
Since you have already spent a lot of time tweaking and tuning: memory modules, HDD/SSD, faulty audio, and a lot of other things affect latency a lot, so:
- if you have more than one DIMM = remove all but one and test = usually no big improvements
- if you have spare HDD/SSD test with it, also test booted from USB = always lower latency compared to HDD = HDD/SSD can have huge impact on latency
- faulty audio will have huge impact on latency, worst case i came across = it took half hour for the laptop to boot and everything else subsequently.

26 Jun 2021 23:56 - 27 Jun 2021 00:13 #212986 by JackW327
Sorry about the attribution error! My bad.

Okay, thank you for the advice. I reinstalled to straighten out the X issues 'cause now that I've looked up everything, redoing the changes is easier than debugging. I digress. Here's the histogram with no changes to any of the Linux config. I'm going to press on from here and see how it goes. Thanks for the tidbit on watching for the disconnect from the 7i95.



Edit: Removed three of four DIMMs, immediately worse, just FYI for anyone thinking about a 790.

Last edit: 27 Jun 2021 00:13 by JackW327.

27 Jun 2021 00:29 #212988 by tommylight
Yeah, forgot to mention, anything lower than 4GB will worsen latency. Sorry for that.
Also, add "--show" at the end of the latency-histogram line so it shows if there are excursions outside of the visible area (there are on the second screenshot, as can be seen by the colored lines on both sides of the graph).

27 Jun 2021 00:49 - 27 Jun 2021 01:54 #212990 by JackW327
No worries at all. Lots to remember ... I forgot that this thing has an SSD in it already. It was a reconditioned machine from an Amazon seller. I put four 4GB DIMMs in it when I got it. I just put #2 back in and it's looking pretty good. Interesting charts ... Okay, well I think I'll run with that until something sends me back to latency issues.

Thanks! Thanks to everyone who's helped create all this too!



Here's another run @ 3426 seconds in. Over time it's normalizing. This is with the --show option.

Last edit: 27 Jun 2021 01:54 by JackW327.

27 Jun 2021 04:26 #212997 by BeagleBrainz
No need to test for a base thread if you are using a Mesa Ethernet card.

Run the latency test with this command.
latency-histogram --nobase

The figures on the servo thread as tested are more than suitable.

27 Jun 2021 13:31 - 27 Jun 2021 13:32 #213019 by PCW
I would not worry much about the latency test unless the results are horrible
(100s of usec). The latency test basically only tests "dispatch" latency. It does
not allocate memory or access hardware or use the network stack.
The actual latency will be much worse. The important thing is that the servo thread
can run at the required rate (usually 1 kHz) without missing the deadline for the next
invocation. You can see how much margin you have by watching the servo-thread.tmax
value (this is in CPU clocks on x86 systems).
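
As a rough illustration of reading that margin, here is a sketch that converts a tmax reading in CPU clocks to microseconds and compares it with the servo period. The CPU frequency and the tmax reading are hypothetical placeholders, not measurements from this machine, and the function names are made up for the example.

```python
def clocks_to_us(clocks, cpu_hz):
    """Convert a CPU clock count to microseconds at a given clock rate."""
    return clocks / cpu_hz * 1e6


def servo_margin(tmax_clocks, cpu_hz, period_ns=1_000_000):
    """Return (tmax in us, spare time in us, fraction of period used).

    period_ns defaults to 1 ms, i.e. a 1 kHz servo thread.
    """
    tmax_us = clocks_to_us(tmax_clocks, cpu_hz)
    period_us = period_ns / 1000.0
    return tmax_us, period_us - tmax_us, tmax_us / period_us


# Hypothetical reading: 660,000 clocks on an assumed 3.3 GHz CPU.
tmax_us, margin_us, used = servo_margin(tmax_clocks=660_000, cpu_hz=3.3e9)

print(f"tmax {tmax_us:.0f} us, margin {margin_us:.0f} us, "
      f"{used:.0%} of the servo period used")
```

The closer the used fraction gets to 100%, the less room the servo thread has before it misses its next invocation.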
Last edit: 27 Jun 2021 13:32 by PCW.
