EtherCAT Comm Sync error (83.3) when exiting LinuxCNC

02 Feb 2022 20:25 #233834 by arvidb
Every time I exit LinuxCNC my drives error out with Error 83.3 "Communications Synchronization Error". (Which also triggers my E-stop chain.) It only happens when quitting LinuxCNC, not when disabling the amps or hitting LinuxCNC's E-stop.

This is pretty annoying, not least because I have to shut down LinuxCNC to be able to write new tuning values to the drive (drives in OP mode won't accept SDO data, it seems).
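
A minimal sketch of guarding such a write, assuming the IgH `ethercat` command-line tool is used for SDO access and that the third column of `ethercat slaves` is the slave state. The function name `sdo_write_if_not_op` and all of its arguments are placeholders for illustration; the real object index, subindex and data type have to come from the drive manual.

```shell
#!/bin/sh
# Write one SDO value with the `ethercat` tool, but only if the slave is not in OP.
# usage: sdo_write_if_not_op <position> <type> <index> <subindex> <value>
# All arguments are placeholders (assumption): take real values from the drive manual.
sdo_write_if_not_op() {
    pos=$1
    dtype=$2
    index=$3
    sub=$4
    value=$5
    # Assumes the third whitespace-separated field of `ethercat slaves` is the state.
    state=$(ethercat slaves -p "$pos" | awk 'NR == 1 { print $3 }')
    if [ "$state" = "OP" ]; then
        echo "slave $pos is in OP, refusing SDO write" >&2
        return 1
    fi
    ethercat download -p "$pos" -t "$dtype" "$index" "$sub" "$value"
}
```

The guard only mirrors the observation that the drive rejects SDO writes in OP; it does not change the drive state itself.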

I haven't tried to troubleshoot it yet; I just wanted to hear whether it's only my drives or a common thing. Ideas for fixing it are certainly welcome!

03 Feb 2022 01:22 #233852 by Grotius
Hi,

It looks like a "./halrun -U", where I think the EtherCAT component and all the others are thrown away.
When quitting LinuxCNC, first try unloading the EtherCAT component in halrun and see how it reacts.

03 Feb 2022 14:53 #233882 by arvidb
Hi Grotius,

'halrun -U lcec' kills LinuxCNC and causes the same 83.3 error on the drives.

LinuxCNC console output after executing the halrun command:
task: 5990 cycles, min=0.000011, max=0.017527, avg=0.010041, 0 latency excursions (> 10x expected cycle time of 0.010000s)
MOTION: cleanup_module() started.
MOTION: cleanup_module() finished.
Traceback (most recent call last):
  File "/home/arvidb/src/linuxcnc-dev/bin/axis", line 4192, in <module>
    o.mainloop()
  File "/usr/lib/python3.7/tkinter/__init__.py", line 1283, in mainloop
    self.tk.mainloop(n)
  File "/usr/lib/python3.7/tkinter/__init__.py", line 1700, in __call__
    def __call__(self, *args):
KeyboardInterrupt
Shutting down and cleaning up LinuxCNC...
USRMOT: ERROR: command timeout
shmctl(2850833, IPC_STAT, ...): Invalid argument
Note: Using POSIX realtime
LinuxCNC terminated with an error.  You can find more information in the log:
    /home/arvidb/linuxcnc_debug.txt
and
    /home/arvidb/linuxcnc_print.txt
as well as in the output of the shell command 'dmesg' and in the terminal

So nothing unexpected or useful, unfortunately.

04 Feb 2022 01:23 #233937 by Grotius
Hi,

I meant that it behaves like a "halrun -U". And that seems to be the case, since you get the same error.

My next step in debugging would be:
Try to unloadrt your EtherCAT component with a running LinuxCNC environment. If this goes OK, you are one step ahead.

My thought on the cause at the moment:
Somewhere there is a request to shut down your entire HAL environment. But where?

04 Feb 2022 07:23 #233948 by db1981
This is normal behaviour!

If you stop the control/PLC/LinuxCNC, the master's RT thread is shut down, so all slaves go from EtherCAT OP state back to PREOP/INIT. The sync error is raised if a slave has been configured for DC clock mode and the DC master is shut down.

This can't be changed.

What drives are these? Generic, or one of the coded drivers?

If you are a C person: in this branch I added a function for reading/writing SDOs while the RT thread is running, "class_rt_sdo". You can see its usage in the nanotec pd4e driver.

github.com/steup-engineering/linuxcnc-et...ree/add-nanotec-pd4e

04 Feb 2022 08:04 - 04 Feb 2022 08:09 #233950 by Hakan
It doesn't happen here. I don't know if the conditions you mention with the DC clock are met here.
LinuxCNC exits cleanly here, with no errors like that. There are some messages in "dmesg", though.
Last edit: 04 Feb 2022 08:09 by Hakan.

04 Feb 2022 09:00 #233958 by db1981
This is the same behaviour that arvidb described in his first post.

The only difference could be that his drives latch the error state in the drive itself and need an extra "error reset" before the next start.

05 Feb 2022 00:37 #234002 by arvidb

Grotius wrote:
My next step in debugging would be:
Try to unloadrt your EtherCAT component with a running LinuxCNC environment. If this goes OK, you are one step ahead.

I should have checked what 'halrun -U' actually does before replying to you: I confused it with the unloadrt function!

Anyway, trying 'halcmd unloadrt lcec' gave very interesting results:
$ halcmd unloadrt lcec
$ ethercat slaves
0  0:0  PREOP  +  R88D-KN04H-ECT G5 Series ServoDrive/Motor 

I.e. the drive goes back to PREOP mode - and without any errors! In this state LinuxCNC continues to run, but obviously without any feedback from the servo.
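
The state check above can be scripted. A minimal sketch, assuming `ethercat slaves` keeps the column layout shown above (index, alias:position, state, flag, name); the function names `slave_states` and `any_slave_in_op` are made up for illustration.

```shell
#!/bin/sh
# Print "<index> <state>" for each line of `ethercat slaves` output read from stdin.
# Assumes the state is the third whitespace-separated field, as in the output above.
slave_states() {
    awk 'NF >= 3 { print $1, $3 }'
}

# Succeed (exit status 0) if any slave read from stdin is still in OP.
any_slave_in_op() {
    slave_states | awk '$2 == "OP" { found = 1 } END { exit (found ? 0 : 1) }'
}
```

Typical use after `halcmd unloadrt lcec` would be `ethercat slaves | slave_states`, or `ethercat slaves | any_slave_in_op` to test whether a drive is still in OP before quitting.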

And what's more, quitting LinuxCNC in this state doesn't trigger any error either! So you are clearly onto something here. Thanks!

05 Feb 2022 00:46 #234004 by arvidb

db1981 wrote:
This is normal behaviour!

If you stop the control/PLC/LinuxCNC, the master's RT thread is shut down, so all slaves go from EtherCAT OP state back to PREOP/INIT. The sync error is raised if a slave has been configured for DC clock mode and the DC master is shut down.

This can't be changed.

Well... unless my one and only test so far (described above) was a fluke, it seems there is a way to shut down the bus in an orderly fashion after all?

db1981 wrote:
What drives are these? Generic, or one of the coded drivers?

It's an Omron G5 drive with the generic driver. I'll attach my lcec config file (very much a work in progress) to this post.

db1981 wrote:
If you are a C person: in this branch I added a function for reading/writing SDOs while the RT thread is running, "class_rt_sdo". You can see its usage in the nanotec pd4e driver.

github.com/steup-engineering/linuxcnc-et...ree/add-nanotec-pd4e

Cool, I'll check it out!

05 Feb 2022 00:54 #234005 by arvidb
A little more testing: I added a hal shutdown script with 'unloadrt lcec' and that seems to be enough to prevent the Sync Error when quitting LinuxCNC. I do get this error sometimes on quitting though:
Shutting down and cleaning up LinuxCNC...
Running HAL shutdown script
Failed to receive: Inappropriate ioctl for device
Failed to process domain: Inappropriate ioctl for device
rtapi_app: caught signal 11 - dumping core

rtapi_app: caught signal 11 - dumping core
task: 1102 cycles, min=0.000012, max=0.020384, avg=0.009946, 0 latency excursions (> 10x expected cycle time of 0.010000s)
USRMOT: ERROR: command timeout
Waited 3 seconds for master.  giving up.
Note: Using POSIX realtime
mux2: not loaded
<commandline>:0: exit value: 255
<commandline>:0: rmmod failed, returned -1
Note: Using POSIX realtime
cia402: not loaded
<commandline>:0: exit value: 255
<commandline>:0: rmmod failed, returned -1
Note: Using POSIX realtime
motmod: not loaded
<commandline>:0: exit value: 255
<commandline>:0: rmmod failed, returned -1
Note: Using POSIX realtime
trivkins: not loaded
<commandline>:0: exit value: 255
<commandline>:0: rmmod failed, returned -1
<commandline>:0: unloadrt failed
Note: Using POSIX realtime

Not very nice, but I don't know if it's actually causing any problems? Nice not to have to reset the e-stop chain on every restart though...
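
For reference, a minimal sketch of such a setup, assuming the shutdown script is hooked in through the `SHUTDOWN` entry in the `[HAL]` section of the INI file. The file name `shutdown.hal` and the helper `write_shutdown_hal` are illustrative choices, not taken from the config used in this thread.

```shell
#!/bin/sh
# Create a HAL shutdown script that unloads the EtherCAT component (lcec)
# before LinuxCNC tears down the rest of HAL.
# Assumed INI hookup (added by hand to the machine's INI file):
#   [HAL]
#   SHUTDOWN = shutdown.hal
write_shutdown_hal() {
    # $1: config directory (hypothetical parameter); defaults to the current directory
    dir=${1:-.}
    printf 'unloadrt lcec\n' > "$dir/shutdown.hal"
}
```

Calling `write_shutdown_hal /path/to/config` writes a one-line `shutdown.hal` there. Whether this also avoids the occasional signal 11 shown above is not something the sketch addresses.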