milltask died on signal 11; OK again after removing position.txt

06 Jun 2017 22:21 #94206 by DaBit
LinuxCNC master, RIP build from source, running on a Raspberry Pi.

I have run into something odd. Once in a while LinuxCNC on my Pi refuses to start and dies with '/home/pi/linuxcnc-dev/bin/milltask (pid 9090) died on signal 11, backtrace stored in /tmp/backtrace.9090' (or another PID).

Full startup output:
pi@raspberrypi:~/linuxcnc/configs/3dprinter $ linuxcnc 3dprinter.ini
LINUXCNC - 2.8.0~pre1
Machine configuration directory is '/home/pi/linuxcnc/configs/3dprinter'
Machine configuration file is '3dprinter.ini'
Starting LinuxCNC...
(time=1496785657.825587,pid=9061): Registering server on TCP port 5005.
(time=1496785657.826260,pid=9061): running server for TCP port 5005 (connection_socket = 3).
iocontrol: machine: '3dprinter'  version 'unknown'
emc/iotask/ioControl.cc 768: can't load tool table.
Found file(REL): ./3dprinter.hal
Note: Using POSIX realtime
loading dspin module
Debug: euid: 1000 uid 1000
EUID is not 0 (root). Trying seteuid()...
euid: 0 uid 1000
joint 0 config register: 2e88
joint 1 config register: 2e88
joint 2 config register: 2e88
joint 3 config register: 2e88
task: machine: '3dprinter'  version 'unknown'
/home/pi/linuxcnc/configs/3dprinter/3dprinter.ini:24: executing 'import sys
sys.path.insert(0,"python")'
PythonPlugin: Python  '2.7.9 (default, Sep 17 2016, 20:55:23)
[GCC 4.9.2]'
emcTaskOnce: Python plugin configured
emcTaskOnce: extract(task_instance): KeyError: ('task',)

emcTaskOnce: no Python Task() instance available, using default iocontrol-based task methods
emcTrajSetJoints(4) returned 0
emcTrajSetAxes(4, 15)
emcTrajSetUnits(1.0000, 1.0000)
emcTrajSetVelocity(0.0000, 80.0000) returned 0
emcTrajSetMaxVelocity(160.0000) returned 0
emcTrajSetAcceleration(999999999999999967336168804116691273849533185806555472917961779471295845921727862608739868455469056.0000) returned 0
emcTrajSetMaxAcceleration(999999999999999967336168804116691273849533185806555472917961779471295845921727862608739868455469056.0000)
emcTrajSetHome(0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000) returned 0
emcJointSetType(0, 1)
emcJointSetUnits(0, 1.0000)
emcJointSetBacklash(0, 0.0000) returned 0
emcJointSetMinPositionLimit(0, -130.0000) returned 0
emcJointSetMaxPositionLimit(0, 130.0000) returned 0
emcJointSetFerror(0, 0.5000) returned 0
emcJointSetMinFerror(0, 0.5000) returned 0
emcJointSetHomingParams(0, -125.0000, -119.0000, -1.0000, -50.0000, -5.0000, 0, 1, 0, 2, 0) returned 0
emcJointSetMaxVelocity(0, 400.0000) returned 0
emcJointSetMaxAcceleration(0, 5000.0000) returned 0
emcJointActivate(0) returned 0
emcJointSetType(1, 1)
emcJointSetUnits(1, 1.0000)
emcJointSetBacklash(1, 0.0000) returned 0
emcJointSetMinPositionLimit(1, -130.0000) returned 0
emcJointSetMaxPositionLimit(1, 130.0000) returned 0
emcJointSetFerror(1, 0.5000) returned 0
emcJointSetMinFerror(1, 0.5000) returned 0
emcJointSetHomingParams(1, 130.0000, 130.0000, -1.0000, 50.0000, 5.0000, 0, 1, 0, 1, 0) returned 0
emcJointSetMaxVelocity(1, 400.0000) returned 0
emcJointSetMaxAcceleration(1, 5000.0000) returned 0
emcJointActivate(1) returned 0
emcJointSetType(2, 1)
emcJointSetUnits(2, 1.0000)
emcJointSetBacklash(2, 0.0000) returned 0
emcJointSetMinPositionLimit(2, -250.0000) returned 0
emcJointSetMaxPositionLimit(2, 0.0000) returned 0
emcJointSetFerror(2, 0.5000) returned 0
emcJointSetMinFerror(2, 0.5000) returned 0
emcJointSetHomingParams(2, -50.0000, -5.1500, -1.0000, 20.0000, -5.0000, 0, 1, 0, 2, 0) returned 0
emcJointSetMaxVelocity(2, 20.0000) returned 0
emcJointSetMaxAcceleration(2, 200.0000) returned 0
emcJointActivate(2) returned 0
emcJointSetType(3, 2)
emcJointSetUnits(3, 1.0000)
emcJointSetBacklash(3, 0.0000) returned 0
emcJointSetMinPositionLimit(3, -100000000000.0000) returned 0
emcJointSetMaxPositionLimit(3, 100000000000.0000) returned 0
emcJointSetFerror(3, 0.5000) returned 0
emcJointSetMinFerror(3, 0.5000) returned 0
emcJointSetHomingParams(3, 0.0000, 0.0000, -1.0000, 0.0000, 0.0000, 0, 1, 0, 0, 0) returned 0
emcJointSetMaxVelocity(3, 1440.0000) returned 0
emcJointSetMaxAcceleration(3, 2000.0000) returned 0
emcJointActivate(3) returned 0
emcAxisSetMinPositionLimit(0, -130.0000) returned 0
emcAxisSetMaxPositionLimit(0, 130.0000) returned 0
emcAxisSetMaxVelocity(0, 300.0000) returned 0
emcAxisSetMaxAcceleration(0, 3000.0000) returned 0
emcAxisSetLockingJoint(0, -1) returned 0
emcAxisSetMinPositionLimit(1, -130.0000) returned 0
emcAxisSetMaxPositionLimit(1, 130.0000) returned 0
emcAxisSetMaxVelocity(1, 300.0000) returned 0
emcAxisSetMaxAcceleration(1, 3000.0000) returned 0
emcAxisSetLockingJoint(1, -1) returned 0
emcAxisSetMinPositionLimit(2, -250.0000) returned 0
emcAxisSetMaxPositionLimit(2, 0.0000) returned 0
emcAxisSetMaxVelocity(2, 20.0000) returned 0
emcAxisSetMaxAcceleration(2, 200.0000) returned 0
emcAxisSetLockingJoint(2, -1) returned 0
emcAxisSetMinPositionLimit(3, -100000000000.0000) returned 0
emcAxisSetMaxPositionLimit(3, 100000000000.0000) returned 0
emcAxisSetMaxVelocity(3, 1440.0000) returned 0
emcAxisSetMaxAcceleration(3, 1250.0000) returned 0
emcAxisSetLockingJoint(3, -1) returned 0
NML_INTERP_LIST(0x2c9a70)::append(nml_msg_ptr{size=24,type=EMC_TRAJ_SET_TERM_COND}) : list_size=1, line_number=0
/home/pi/linuxcnc/configs/3dprinter/3dprinter.ini:24: executing 'import sys
sys.path.insert(0,"python")'
PythonPlugin: Python  '2.7.9 (default, Sep 17 2016, 20:55:23)
[GCC 4.9.2]'
is_callable(remap.m84) = TRUE
is_callable(remap.m400) = TRUE
NML_INTERP_LIST(0x2c9a70)::append(nml_msg_ptr{size=88,type=EMC_TRAJ_SET_G5X}) : list_size=2, line_number=0
NML_INTERP_LIST(0x2c9a70)::append(nml_msg_ptr{size=88,type=EMC_TRAJ_SET_G92}) : list_size=3, line_number=0
NML_INTERP_LIST(0x2c9a70)::append(nml_msg_ptr{size=24,type=EMC_TRAJ_SET_ROTATION}) : list_size=4, line_number=0
is_callable(__init__) = FALSE
emcTaskPlanInit() returned 0
Waiting for component 'inihal' to become ready..printergui: using INI file  /home/pi/linuxcnc/configs/3dprinter/3dprinter.ini
...Issuing ESTOP RESET
............../home/pi/linuxcnc-dev/bin/milltask (pid 9090) died on signal 11, backtrace stored in /tmp/backtrace.9090
/home/pi/linuxcnc-dev/bin/milltask exiting

<commandline>:0: milltask exited without becoming ready
Axis X button clicked, widget= <gtk.ToggleButton object at 0x73a998f0 (GtkToggleButton at 0xb19930)>
postGUI HAL found:  3dprinter_postgui.hal
Trying to open /dev/hidraw0 ..Failed.
Trying to open /dev/hidraw1 ..Failed.
Trying to open /dev/hidraw2 ..Succes. Querying device for VID/PID..
Found device at /dev/hidraw2
SUCCESS opening device
*ERROR* /home/pi/linuxcnc-dev/bin/milltask (pid 9090) died on signal 11, backtrace stored in /tmp/backtrace.9090
quit with cancel
Shutting down and cleaning up LinuxCNC...
(time=1496785852.322310,pid=9061): Deleting 5 channels from the NML_Main_Channel_List.
(time=1496785852.322511,pid=9061): Deleting emcCommand NML channel from NML_Main_Channel_List.
(time=1496785852.322549,pid=9061): deleting NML (1)
(time=1496785852.322580,pid=9061):  delete (CMS *) 0x1e921f0;
(time=1496785852.322631,pid=9061): rcs_shm_close(shm->key=1001(0x3E9),shm->size=8192(0x2000),shm->addr=0x76f71000)
(time=1496785852.322719,pid=9061): deleting CMS (emcCommand)
(time=1496785852.322764,pid=9061): free( data = 0x1e92c08);
(time=1496785852.322790,pid=9061): Leaving ~CMS()
(time=1496785852.322815,pid=9061):  CMS::delete(0x1e921f0)
(time=1496785852.322836,pid=9061):  CMS::delete successful.
(time=1496785852.322860,pid=9061): Leaving ~NML()
(time=1496785852.322880,pid=9061): NML channel deleted from NML_Main_Channel_List
(time=1496785852.322902,pid=9061): Deleting emcStatus NML channel from NML_Main_Channel_List.
(time=1496785852.322925,pid=9061): deleting NML (2)
(time=1496785852.322946,pid=9061):  delete (CMS *) 0x1e97b78;
(time=1496785852.322970,pid=9061): rcs_shm_close(shm->key=1002(0x3EA),shm->size=16384(0x4000),shm->addr=0x76f6d000)
(time=1496785852.323016,pid=9061): deleting CMS (emcStatus)
(time=1496785852.323042,pid=9061): free( data = 0x1e98578);
(time=1496785852.323066,pid=9061): Leaving ~CMS()
(time=1496785852.323088,pid=9061):  CMS::delete(0x1e97b78)
(time=1496785852.323109,pid=9061):  CMS::delete successful.
(time=1496785852.323131,pid=9061): Leaving ~NML()
(time=1496785852.323151,pid=9061): NML channel deleted from NML_Main_Channel_List
(time=1496785852.323173,pid=9061): Deleting emcError NML channel from NML_Main_Channel_List.
(time=1496785852.323196,pid=9061): deleting NML (3)
(time=1496785852.323217,pid=9061):  delete (CMS *) 0x1e9c9e0;
(time=1496785852.323242,pid=9061): rcs_shm_close(shm->key=1003(0x3EB),shm->size=8192(0x2000),shm->addr=0x76f6b000)
(time=1496785852.323285,pid=9061): deleting CMS (emcError)
(time=1496785852.323311,pid=9061): free( data = 0x1e9d3e0);
(time=1496785852.323334,pid=9061): Leaving ~CMS()
(time=1496785852.323354,pid=9061):  CMS::delete(0x1e9c9e0)
(time=1496785852.323375,pid=9061):  CMS::delete successful.
(time=1496785852.323397,pid=9061): Leaving ~NML()
(time=1496785852.323418,pid=9061): NML channel deleted from NML_Main_Channel_List
(time=1496785852.323439,pid=9061): Deleting toolCmd NML channel from NML_Main_Channel_List.
(time=1496785852.323461,pid=9061): deleting NML (4)
(time=1496785852.323482,pid=9061):  delete (CMS *) 0x1e9f778;
(time=1496785852.323505,pid=9061): rcs_shm_close(shm->key=1004(0x3EC),shm->size=1024(0x400),shm->addr=0x76f6a000)
(time=1496785852.323547,pid=9061): deleting CMS (toolCmd)
(time=1496785852.323573,pid=9061): free( data = 0x1ea0178);
(time=1496785852.323595,pid=9061): Leaving ~CMS()
(time=1496785852.323616,pid=9061):  CMS::delete(0x1e9f778)
(time=1496785852.323637,pid=9061):  CMS::delete successful.
(time=1496785852.323659,pid=9061): Leaving ~NML()
(time=1496785852.323679,pid=9061): NML channel deleted from NML_Main_Channel_List
(time=1496785852.323700,pid=9061): Deleting toolSts NML channel from NML_Main_Channel_List.
(time=1496785852.323722,pid=9061): deleting NML (5)
(time=1496785852.323743,pid=9061):  delete (CMS *) 0x1ea0940;
(time=1496785852.323768,pid=9061): rcs_shm_close(shm->key=1005(0x3ED),shm->size=8192(0x2000),shm->addr=0x76ecf000)
(time=1496785852.323811,pid=9061): deleting CMS (toolSts)
(time=1496785852.323836,pid=9061): free( data = 0x1ea1340);
(time=1496785852.323858,pid=9061): Leaving ~CMS()
(time=1496785852.323878,pid=9061):  CMS::delete(0x1ea0940)
(time=1496785852.323899,pid=9061):  CMS::delete successful.
(time=1496785852.323921,pid=9061): Leaving ~NML()
(time=1496785852.323941,pid=9061): NML channel deleted from NML_Main_Channel_List
(time=1496785852.323969,pid=9061): deleting NML (1)
(time=1496785852.323991,pid=9061): Leaving ~NML()
(time=1496785852.324013,pid=9061): NML::operater delete(0x1e92008)
(time=1496785852.324036,pid=9061): NML channel deleted from Dynamically_Allocated_NML_Objects
(time=1496785852.324061,pid=9061): deleting NML (2)
(time=1496785852.324082,pid=9061): Leaving ~NML()
(time=1496785852.324102,pid=9061): NML::operater delete(0x1e979f0)
(time=1496785852.324124,pid=9061): NML channel deleted from Dynamically_Allocated_NML_Objects
(time=1496785852.324149,pid=9061): deleting NML (3)
(time=1496785852.324170,pid=9061): Leaving ~NML()
(time=1496785852.324190,pid=9061): NML::operater delete(0x1e9c7d8)
(time=1496785852.324212,pid=9061): NML channel deleted from Dynamically_Allocated_NML_Objects
(time=1496785852.324234,pid=9061): deleting NML (4)
(time=1496785852.324255,pid=9061): Leaving ~NML()
(time=1496785852.324275,pid=9061): NML::operater delete(0x1e9f5f0)
(time=1496785852.324297,pid=9061): NML channel deleted from Dynamically_Allocated_NML_Objects
(time=1496785852.324319,pid=9061): deleting NML (5)
(time=1496785852.324340,pid=9061): Leaving ~NML()
(time=1496785852.324359,pid=9061): NML::operater delete(0x1ea07b8)
(time=1496785852.324381,pid=9061): NML channel deleted from Dynamically_Allocated_NML_Objects
dSpin module unloaded
Note: Using POSIX realtime

INI file: github.com/dabit20/rpi_cnc/blob/master/l...rinter/3dprinter.ini

When I remove 'position.txt' ([TRAJ]POSITION_FILE = position.txt), LinuxCNC starts again.

When LinuxCNC refuses to start, position.txt contains nothing unusual, and there are no extra characters in the file (verified with hexdump):
-121.33456999940048408
130.00000000000000000
122.67128547399834712
0.00000000000000000
0.00000000000000000
0.00000000000000000
0.00000000000000000
0.00000000000000000
0.00000000000000000
fsck shows that the filesystem is clean.
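
For anyone who wants to repeat that check without reading a hexdump by eye, here is a minimal Python sketch. It assumes the file should hold exactly nine newline-terminated lines, each a finite decimal number, which is only inferred from the listing above and not from the LinuxCNC source. The function name check_position_file is made up for this example.

```python
import math
from pathlib import Path


def check_position_file(path, expected_lines=9):
    """Return a list of problems found in the file (empty list if none).

    Assumption: nine newline-terminated numeric lines, as in the listing
    above. This is not taken from the LinuxCNC source.
    """
    problems = []
    raw = Path(path).read_bytes()
    if not raw.endswith(b"\n"):
        problems.append("file does not end with a newline (possibly truncated)")
    # Non-ASCII bytes become U+FFFD, which then fails the float() parse below.
    lines = raw.decode("ascii", errors="replace").splitlines()
    if len(lines) != expected_lines:
        problems.append(f"expected {expected_lines} lines, found {len(lines)}")
    for number, text in enumerate(lines, start=1):
        try:
            value = float(text)
        except ValueError:
            problems.append(f"line {number}: not a number: {text!r}")
            continue
        if not math.isfinite(value):
            problems.append(f"line {number}: non-finite value {value!r}")
    return problems


if __name__ == "__main__":
    import sys

    for problem in check_position_file(sys.argv[1]):
        print(problem)
```

Run it as `python3 check_position.py position.txt`. No output means the file passed these checks, which would point away from a malformed file as the trigger.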

It is not really an issue; I can live without stored positions, but I thought I would post it anyway.

07 Jun 2017 11:52 #94221 by andypugh
If it is always the same process (or, at least, always milltask), then that sounds like a software bug.
But if the module that segfaults is random, then it could be bad memory (tricky to fix on a Pi).

Make sure that you have a good PSU; I have heard a lot recently about the Pi being flaky with a marginal PSU.

07 Jun 2017 16:41 #94250 by DaBit
Power to the Pi comes from a 3 A DC/DC converter set to 5.25 V, with separate decent-quality cables to the display and the Pi. The Pi itself is underclocked to 1 GHz to help stability, since the PREEMPT_RT kernel always runs the cores at this frequency. The underclocking is probably no longer necessary with the extra cooling, but I do not need the extra performance either.

It is always milltask that segfaults, and always at the same point in the initialisation sequence. Once LinuxCNC is running, it runs for days without issues.
The milltask segfault persists across reboots: once it happens, it keeps happening, even after a reboot, until I remove position.txt. Then LinuxCNC starts normally again.

I expected filesystem/storage corruption (Pis are also known for corrupting SD cards), but I cannot find any sign of it.

So far I have not been able to reproduce it with a sim config, but then again, it happens after many (>100) LinuxCNC starts.

07 Jun 2017 17:47 #94253 by andypugh
position.txt is written on exit. I wonder if sometimes the Pi shuts down before finishing the write?

Do you actually need position.txt?
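
To illustrate the partial-write idea: the usual way to make sure a reader never sees a half-written file is to write to a temporary file in the same directory, flush it to disk, and then rename it over the old file. The sketch below shows that general pattern in Python. It is not how LinuxCNC writes POSITION_FILE (that has not been checked here). The function name write_positions_atomically is hypothetical, and the 17-decimal format is only copied from the listing earlier in the thread.

```python
import os
import tempfile


def write_positions_atomically(path, positions):
    """Write one value per line, replacing `path` only after a full write.

    Illustration of the write-then-rename pattern; not LinuxCNC code.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must be on the same filesystem for the rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".position-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as handle:
            for value in positions:
                handle.write(f"{value:.17f}\n")
            handle.flush()
            os.fsync(handle.fileno())  # push the data to storage before renaming
        os.replace(tmp_path, path)  # atomic on POSIX: old or new file, never a mix
    except BaseException:
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise


if __name__ == "__main__":
    write_positions_atomically("position.txt", [0.0] * 9)
```

With this pattern an interrupted write leaves the previous position.txt untouched, so a truncated file would not be expected. Whether a truncated file is actually what crashes milltask here is still only a guess.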

07 Jun 2017 20:22 #94264 by DaBit
The Pi is running 24/7 at the moment. LinuxCNC is started and stopped, though.

I do not need position.txt, so it is not a real issue for me. Once I only have to change things occasionally, the rootfs will be switched to read-only anyway (using overlayfs, with the SD card as the lower layer and tmpfs as the upper layer), meaning that all changes are reverted when power is cycled. I like that behaviour in embedded systems.

I just noticed this milltask segfault behaviour and thought I would mention it.
