Search Results (Searched for: )

  • papagno-source
Yesterday 06:34
Replied by papagno-source on topic Ethercat random jitter fix

Ethercat random jitter fix

Category: EtherCAT

I put this question to an AI:

ethercat on debian 10 with kernel 4.19 and linuxcnc has less latency than debian 13 with kernel 6.12. Why?

The reply, posted only out of curiosity:

This is a very insightful observation, and while it may seem counterintuitive (new = better?), in the world of Real-Time (RT) and numerical control (NC) like LinuxCNC, "old and stable" often trumps "new and complex."

The move from a 4.19 kernel (Debian 10) to a 6.12 kernel (Debian 13) introduces several layers of complexity that can degrade deterministic latency. Here are the main reasons:

1. Security Mitigations (The "Price" of CPUs)

Since the 4.19 kernel era, numerous hardware vulnerabilities have been discovered in CPUs (Spectre, Meltdown, L1TF, MDS, etc.).

Kernel 4.19: Many of these mitigations were not yet implemented or were less invasive.

Kernel 6.12: Includes a comprehensive suite of security patches that add overhead to every syscall and every switch between user-space and kernel-space. This increases jitter and maximum latency.

2. Kernel RT Integration (Preempt-RT)

Kernel 4.19: The PREEMPT_RT patch was a very mature set of external changes focused exclusively on pure real-time performance.

Kernel 6.12: Much of the real-time code has now been integrated into the Mainline Kernel. While this is a huge technical achievement, the overall integration must make compromises with other modern kernel features, making it sometimes less "extreme" than dedicated patches of the past.

3. Power Management (C-States and P-States)

Newer kernels are optimized for power efficiency.

Kernel 6.12 aggressively attempts to put CPU cores into power-saving states or vary the frequency.

Waking a CPU from a deep "C-state" requires precious microseconds, which, in EtherCAT communication (which often runs at 1kHz or more), translates into latency spikes.

4. EtherCAT Driver (IgH or SOEM)

The EtherCAT master (usually the IgH EtherCAT Master) must be patched to work with each kernel version.

On older kernels like 4.19, "native" drivers (those that replace the standard network card drivers, e.g., ec_8139too) are extremely stable and fast.

On very recent kernels, you're often forced to use the "generic" driver, which runs through the standard Linux network stack, inevitably adding latency compared to an optimized native driver.

5. Operating System Complexity

Debian 13 loads many more background services, more complex systemd versions, and more intensive D-Bus management than Debian 10. Even though LinuxCNC runs at high priority, background operating system noise can affect determinism.

How can I improve the situation on Debian 13?

If you absolutely must use the latest version, you can try "neutering" modern kernel features to regain speed:

Disable mitigations: Add mitigations=off to the GRUB boot parameters. (Warning: This reduces PC security.)

Core isolation: Use isolcpus=X (where X is the core dedicated to LinuxCNC) to prevent the operating system from using it for generic processes.

Idle=poll: Add idle=poll or intel_idle.max_cstate=0 to prevent the CPU from "sleeping."

Check Drivers: Make sure you don't use the generic EtherCAT driver if your network card supports an updated native driver for the 6.x kernel.

In summary: Debian 10 with kernel 4.19 is a "minimalist" and hardware-friendly combination. Debian 13 is a modern, hardened system optimized for power savings—the exact opposite of what's needed to drive a brushless motor in real time.
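
To see which of the boot parameters listed above a machine was actually started with, something like the following could be used. It is only a sketch: the function name check_rt_cmdline is made up for illustration, and it looks only for the four parameters named in the reply (mitigations, isolcpus, idle, intel_idle.max_cstate).

```shell
# Report the value of each tuning parameter found on a kernel command line.
# Pass the command line as $1, or omit it to read /proc/cmdline.
# check_rt_cmdline is a hypothetical helper name, not part of any package.
check_rt_cmdline() (
    set -f                                   # no glob expansion while splitting
    cmdline="${1:-$(cat /proc/cmdline)}"
    for name in mitigations isolcpus idle intel_idle.max_cstate; do
        found="(not set)"
        for arg in $cmdline; do              # split on whitespace
            [ "${arg%%=*}" = "$name" ] && found="${arg#*=}"
        done
        printf '%s=%s\n' "$name" "$found"
    done
)

# Example: inspect the running kernel
# check_rt_cmdline
```

Called without arguments it prints one line per parameter for the running kernel, so a missing isolcpus or idle setting shows up as "(not set)".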
  • grandixximo
Yesterday 06:22
Replied by grandixximo on topic Ethercat random jitter fix

Ethercat random jitter fix

Category: EtherCAT

0x92c seems to correlate, but not perfectly in my testing: I can still get quick OP with a big 0x92c.
So it might not be the only factor, but I agree it is at least one of the relevant factors.
Smaller 0x92c means quicker OP; I think this has been true in my testing.
  • rodw
Yesterday 06:17
Replied by rodw on topic Ethercat random jitter fix

Ethercat random jitter fix

Category: EtherCAT

edit:
tested with new kernel
uname -r
6.3.0-rt11-linuxcnc
same results, almost no effect
44sec for R2M
12sec for M2R
rule out kernel?

Can't be the kernel then. It was worth a try. When Debian 12 came out on the 6.1 kernel, we were all trying to work out why latency was so bad, but the factors are now well known, so updating the kernel no longer helps. (I have found Trixie is better than Bookworm.)

@hakan, I read somewhere (maybe even on this thread) that the 5000 ms message was just a warning, and not really something to be concerned about.
  • papagno-source
Yesterday 06:12 - Yesterday 06:27
Replied by papagno-source on topic Ethercat random jitter fix

Ethercat random jitter fix

Category: EtherCAT

Good morning everyone. Today I'll try the latest Grandixximo patch.
I can assure you that all PDOs, both on the I/O side of the first slave and on the drives, are read immediately. The same is true on another machine, where in addition to the drives there are two Delta inverters.

I tried running the following commands on Debian 10:
ethercat debug 1
start linuxcnc
exit or leave running, it doesn't matter.
sudo journalctl --since "2 min ago" | grep "Sync after"

no results.

uname -r 4.19.0-27-rt-amd64
dpkg -l | grep linuxcnc-uspace : no results
dpkg -l | grep ethercat: no results

I checked, and if I don't start axis, all the slaves are in PREOP.

The OP state and writing to the slave pins only occurs immediately after running addf lcec.write-all servo-thread in hal.

Just for completeness of information. On Debian 10, as well as on Trixie, I am using the "generic" device in the ethercat.conf file.
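
For anyone comparing setups, the configured device module can be read back from the config file with a small helper like the one below. This is a sketch under assumptions: the default path /etc/ethercat.conf and the DEVICE_MODULES variable are what I would expect on a packaged IgH master install, and ec_device_modules is just an illustrative name.

```shell
# Print the value of the last uncommented DEVICE_MODULES= line in an
# IgH-style config file. Path and variable name are assumptions; pass a
# different file as $1 if your install keeps the config elsewhere.
ec_device_modules() {
    conf="${1:-/etc/ethercat.conf}"
    sed -n 's/^[[:space:]]*DEVICE_MODULES=//p' "$conf" \
        | tail -n 1 \
        | tr -d "\"'"                        # strip surrounding quotes
}

# Example:
# ec_device_modules        # expected to print "generic" on the setups above
```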
  • 3404gerber
Yesterday 06:08
Replied by 3404gerber on topic retrofitting a Proxon for coin die milling

retrofitting a Proxon for coin die milling

Category: Milling Machines

As rodw said, try to add isolcpus=3 or even isolcpus=2,3 in your /boot/firmware/cmdline.txt

Also, I did some tests with Remora over ethernet on a Pi 400, and you should follow rodw's video to adjust the ethernet connection to your mesa card.
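
After rebooting, it is worth confirming that the isolation took effect. The kernel exposes the isolated CPU list in /sys/devices/system/cpu/isolated, in the same "2,3" or "2-3" notation. Below is a rough sketch of a check; cpu_is_isolated is a made-up helper name, and it assumes the list contains only plain numbers and ranges.

```shell
# Succeed (exit 0) if CPU number $1 appears in the CPU list $2,
# where $2 looks like "3", "2,3" or "2-3,5".
# cpu_is_isolated is a hypothetical helper for illustration.
cpu_is_isolated() (
    cpu="$1"
    set -f
    IFS=','
    for part in $2; do                       # split the list on commas
        lo="${part%-*}"                      # "2-3" -> 2, "3" -> 3
        hi="${part#*-}"                      # "2-3" -> 3, "3" -> 3
        if [ "$cpu" -ge "$lo" ] && [ "$cpu" -le "$hi" ]; then
            exit 0
        fi
    done
    exit 1
)

# Example:
# cpu_is_isolated 3 "$(cat /sys/devices/system/cpu/isolated)" && echo "CPU 3 is isolated"
```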
  • Hakan
Yesterday 05:56 - Yesterday 06:17
Replied by Hakan on topic Ethercat random jitter fix

Ethercat random jitter fix

Category: EtherCAT

The code that handles the initial synchronization in the EtherCAT master is 16 years old, except for a variable name change 8 years ago
gitlab.com/etherlab.org/ethercat/-/blame...e=heads&page=2#L1480

There are people with this "Did not sync after 5000 ms." problem but no real solution found.

I am sure the 0x92c register shows the symptoms of the problem.
System Time Difference
 

This has me believing that either
- the slaves' clocks are drifting badly,
- the initial time the slaves get from lcec is not a good choice,
- or both.

Watching the 0x92c register during start, I often, but not always, see it start at 1000000 ns, which is 1 ms.
This is with a 1 ms linuxcnc servo loop time. When I go down to a 0.5 ms servo loop time, 0x92c often starts at 500000 ns,
so there is possibly a correlation.
This initial 1 or 0.5 ms value in 0x92c appears both when I start linuxcnc directly and when I let it sit overnight.
This suggests it isn't a slave clock issue, but rather that the initial time from lcec is the problem.

Edit: The startup time for my lcec rt_app_main isn't that bad, it's 0.85-0.9 secs. Forgot a test-second I had.

Edit 2: Here is a graph of how slave 1 and slave 2 reduce their time differences. This is from journalctl, 0.5 ms servo loop time.
 
  • Todd Zuercher
Yesterday 05:40
Replied by Todd Zuercher on topic Using Offsetpage Widget?

Using Offsetpage Widget?

Category: GladeVCP

Here is what it looks like.  All offsets show zero, even though they are not zero.

 
  • grandixximo
Yesterday 05:00 - Yesterday 06:02
Replied by grandixximo on topic Ethercat random jitter fix

Ethercat random jitter fix

Category: EtherCAT

uname -r 6.12.74+deb13+1-rt-amd64
dpkg -l | grep linuxcnc-uspace
ii linuxcnc-uspace 1:2.9.4-2 amd64 motion controller for CNC machines and robots
dpkg -l | grep ethercat
ii ethercat-dkms 1.6.9.gb709e58-1+28.1 all IgH EtherCAT Master
ii ethercat-master 1.6.9.gb709e58-1+28.1 amd64 IgH EtherCAT Master
ii libethercat 1.6.9.gb709e58-1+28.1 amd64 IgH EtherCAT Master
ii linuxcnc-ethercat 1.40.0.g8a607c0-0 amd64 LinuxCNC EtherCAT HAL driver

I have done testing with the above, this is Debian 13 everything installed from repos.
Generic driver, no ec_igc
20 devices
9 servos DC
1 VFD DC
10 I/O not DC
the 10 I/O always OP instantly
I've tried to keep an eye on
watch -n0 "ethercat reg_read -p2 -tsm32 0x92c"
Mine starts up quite randomly, within a range of + or - 3 million. The servos seem to OP quickly more often than not when this starts as a small value, but it does not correlate perfectly.
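
A note on the sm32 type in that command: as far as I understand it, 0x92c is stored as sign and magnitude (top bit is the sign, lower 31 bits the difference in ns), which is why the value swings between + and -. If you ever read the register as a raw unsigned 32-bit number instead, a small decoder like this sketch turns it back into signed ns. The function name sm32_to_ns is made up, and the bit layout is my reading of the register description, so treat it as an assumption.

```shell
# Convert a raw 32-bit sign-and-magnitude value (hex or decimal) to a
# signed integer. Assumed layout: bit 31 = sign, bits 30..0 = magnitude.
# sm32_to_ns is a hypothetical helper name.
sm32_to_ns() (
    raw=$(( $1 ))
    magnitude=$(( raw & 0x7FFFFFFF ))
    if [ $(( raw & 0x80000000 )) -ne 0 ]; then
        echo "-$magnitude"
    else
        echo "$magnitude"
    fi
)

# Example:
# sm32_to_ns 0x800F4240    # a magnitude of 0xF4240 with the sign bit set
```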

with refClockSyncCycles="1" (from now on I will call this R2M)
I get 40s to all OP, so about 4s per device, but some take 2s and some 6s, not a constant 4s each.
with refClockSyncCycles="-1" (from now on I will call this M2R)
I get 12s to all OP, so about 1.2s per device, again not constant, some quick some slow, but overall much faster.

I have a 1 kHz servo loop, 1000000 ns
@Hakan when you say 2 kHz you mean 500000 ns, correct?
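
For reference, the conversion between servo rate and period is just 1e9 divided by the rate in Hz. A one-line sketch (hz_to_period_ns is only an illustrative name):

```shell
# Servo period in nanoseconds for a given servo rate in Hz (integer math).
# hz_to_period_ns is a hypothetical helper name.
hz_to_period_ns() {
    echo $(( 1000000000 / $1 ))
}

# hz_to_period_ns 1000   -> 1000000
# hz_to_period_ns 2000   -> 500000
```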

Anyway, that is with Scott's code, nothing of mine. I will do further testing with your referenced commit, then test with my changes and different kernels, to see whether linuxcnc-ethercat -1 and +1 are the bigger factors here or not.

edit:
tested with new kernel
uname -r
6.3.0-rt11-linuxcnc
same results, almost no effect
44sec for R2M
12sec for M2R
rule out kernel?
  • Todd Zuercher
Yesterday 04:32
Replied by Todd Zuercher on topic Using Offsetpage Widget?

Using Offsetpage Widget?

Category: GladeVCP

I booted the current live image and created a new one like I had at work.

 

File Attachment:

File Name: offsets.ui
File Size:1 KB
  • Todd Zuercher
18 Apr 2026 03:18
Replied by Todd Zuercher on topic Using Offsetpage Widget?

Using Offsetpage Widget?

Category: GladeVCP

It's just a window with the Offsetpage widget. I don't have a copy here at home (it's at work).
  • rodw
18 Apr 2026 03:10
Replied by rodw on topic Ethercat random jitter fix

Ethercat random jitter fix

Category: EtherCAT

Thanks for clarifying. I guess if 1.5.2 does not solve the issue, the next question is: has linuxcnc-ethercat changed since Debian 10? I suspect not, but can you review and confirm? If nothing has changed, it's a Debian kernel issue.

People forget that modern kernels from 5.9 up (well after Debian 10) are enormously complex, and the hardware developers have been adding features for years (mostly power saving) that the linux kernel unfortunately now supports. So there are a lot of tuning steps now. Perhaps there are some issues we have not covered in our known NIC tunings.

I've shared a 6.3 kernel I built in the past on Discord which many people had good results with. Perhaps it will help. It will be a few days before I can build a later kernel for Trixie.


 
  • grandixximo
18 Apr 2026 02:03 - 18 Apr 2026 02:06
Replied by grandixximo on topic Ethercat random jitter fix

Ethercat random jitter fix

Category: EtherCAT

For somebody trying hard to follow along from the sidelines, can you guys outline the objective for rolling back to 1.5.2 here?
Has anybody looked at the IgH EtherCAT master 1.6, or even 1.7, which was released on gitlab 3 weeks ago (but not in their repositories), to see if it has these delays outside of linuxcnc?

It seems to me unlikely that IgH master 1.6 or 1.7 are problematic, as people would be complaining at IgH. So to me, the effort should be in understanding what has changed that linuxcnc-ethercat is not liking. Rolling back to a very old version does not seem a progressive step following the synchronisation enhancement.


Did it on papagno-source's request

Why don't we try installing Ethercat 1.5.2 on Trixie, so as to understand if the problem is on the kernel or Ethercat side.


This is just for testing. I pretty much knew it was a dead end when I started on this path, but it is so we are all on the same page. I need to bring receipts; I won't simply hand-wave his concerns. He wanted 1.5.2 on Trixie so that he would be convinced that it is not the issue, and I gave him that. I don't want to come back to this point over and over again as we proceed with more testing; once it is proven that 1.5.2 does not solve the issue, we can move on to the next thing. I'm just trying to respect other people's perspective. Call it an unnecessary waste of time if you want, I can see where you are coming from, but I feel the best approach is to move systematically, one step at a time: clear up all concerns, test thoroughly, find agreement, move to the next thing.

1.6 and 1.7 present the same OP time delays in my testing.
  • cmorley
18 Apr 2026 01:28
Replied by cmorley on topic Using Offsetpage Widget?

Using Offsetpage Widget?

Category: GladeVCP

Can you post your glade file?
  • grandixximo
18 Apr 2026 01:17 - 18 Apr 2026 01:20
Replied by grandixximo on topic Ethercat random jitter fix

Ethercat random jitter fix

Category: EtherCAT

Tested a bit.
I get fairly consistent loading times of lcec's rt_app_main() function of 1.85-1.90 seconds.
However, in syslog I can see that the initial sync of slaves takes anywhere between zero and
say 3, 4, or 5 seconds.
To check
ethercat debug 1
start linuxcnc
exit or leave running, doesn't matter.
sudo journalctl --since "2 minutes ago" | grep "Sync after"
and see where it stops for every slave.
I reckon this is the problematic delay time.
Seldom, but it happens, I get "Checking for synchrony" and then it is satisfied directly and moves on.

It seems measuring the time of lcec's rt_app_main is not a good indicator.
Is there a good way of measuring time to OP?

trixie, 
Linux plasma 6.12.69+deb13-rt-amd64 #1 SMP PREEMPT_RT Debian 6.12.69-1 (2026-02-08) x86_64 GNU/Linux
IgH EtherCAT master 1.6.8 1.6.8.g2543cc5-1+27.3  (via apt)
linuxcnc-ethercat, a slightly modified version of commit bf13577ec7b1b54844e07a80d949232e71776b32
linuxcnc 2.9.4 (via apt)


At the end of my hal file I
loadusr halshow
and I save the default watch to display the OP pins. You can have master all-op, and each slave's individual OP pin. I count Mississippis or use a stopwatch...
Another alternative is
watch -n 0.2 ethercat sl
But this might "dirty" the bus and may affect OP time, maybe...
To get more accurate times you might have to build a script, or use dmesg with some debug from ethercat.
But I think the difference papagno-source is talking about can be perceived with less precise tooling: in one case everything is OP even before the UI opens, and in the other he has to wait some amount of time, so it is a difference in the seconds range.
I will try to set up something similar to your setup and check the OP times.
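
As a starting point for the script idea, here is a rough sketch that polls ethercat slaves and reports how long it takes until every slave shows OP. It assumes the state is the third column of each output line, which is how the listing looks on my understanding of the tool; count_not_op and wait_all_op are made-up names. The same caveat applies as for watch: polling may itself disturb the bus, so the numbers are only indicative.

```shell
# Count lines on stdin whose third column (assumed to be the slave state)
# is not "OP". Hypothetical helper name.
count_not_op() {
    awk '$3 != "OP" { n++ } END { print n + 0 }'
}

# Poll until no slave is outside OP, then print the elapsed whole seconds.
# Note: if "ethercat slaves" prints nothing (for example master not running),
# this returns immediately, so check the slave list first.
wait_all_op() {
    start=$(date +%s)
    until [ "$(ethercat slaves | count_not_op)" -eq 0 ]; do
        sleep 0.2
    done
    echo "all slaves in OP after $(( $(date +%s) - start )) s"
}

# Example: start it in a second terminal right before launching linuxcnc
# wait_all_op
```

Resolution is one second because of date +%s, which should be enough for the differences in the seconds range discussed here.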
  • rodw
18 Apr 2026 00:49
Replied by rodw on topic Ethercat random jitter fix

Ethercat random jitter fix

Category: EtherCAT

For somebody trying hard to follow along from the sidelines, can you guys outline the objective for rolling back to 1.5.2 here?
Has anybody looked at the IgH EtherCAT master 1.6, or even 1.7, which was released on gitlab 3 weeks ago (but not in their repositories), to see if it has these delays outside of linuxcnc?

It seems to me unlikely that IgH master 1.6 or 1.7 are problematic, as people would be complaining at IgH. So to me, the effort should be in understanding what has changed that linuxcnc-ethercat is not liking. Rolling back to a very old version does not seem a progressive step following the synchronisation enhancement.