Machine crash with LinuxCNC / EMC2

18 Jun 2014 16:26 #48056 by robbe
Hello everybody,

My name is Robert. I am 29, a designer and CNC hobbyist from Germany.
I am facing a very complicated problem with my machine control.
I am running EMC2 on Ubuntu 9.10.
Everything runs on an Atom CPU board with an isolated CPU for step generation.
The machine is controlled via the parallel port,
which is connected to a breakout board from Benezahn.de and three step/direction servo controllers.
Base and servo thread are at about 4500 (measured in a long-term test under heavy load), which is the value I entered in Stepconf.

Everything works very well, except for occasional but severe machine crashes.
First "Unexpected realtime delay" appears, followed by "joint 0 following error" and "joint 1 following error".
A second later, the milling spindle revs up to maximum and the tool moves in X/Y at maximum feed into the workpiece. In about 200 hours of operation, this has happened 4 times.
The error has occurred with both short and long programs, and with different G-code files.
The G-code itself is correct.

I myself have only a little knowledge of Linux.
A friend has collected all the log files that might be helpful.
These logs and my entire machine configuration are attached to this post.
The most interesting part is this one, I think:
May 28 15:55:18 robotron-cnc-2 rsyslogd: [origin software="rsyslogd" swVersion="4.2.0" x-pid="592" x-info="http://www.rsyslog.com"] rsyslogd was HUPed, type 'lightweight'.
May 28 17:33:30 robotron-cnc-2 kernel: [ 6774.938377] RTAPI: ERROR: Unexpected realtime delay on task 1
May 28 17:33:30 robotron-cnc-2 kernel: [ 6774.938389] This Message will only display once per session.
May 28 17:33:30 robotron-cnc-2 kernel: [ 6774.938392] Run the Latency Test and resolve before continuing.
May 28 17:33:30 robotron-cnc-2 kernel: [ 6774.938407] 6696133: ERROR: joint 0 following error
May 28 17:33:30 robotron-cnc-2 kernel: [ 6774.938412] 6696133: ERROR: joint 1 following error
May 28 17:33:30 robotron-cnc-2 kernel: [ 6774.938417] 6696134: ERROR: Unexpected realtime delay: check dmesg for details.
May 28 17:33:30 robotron-cnc-2 kernel: [ 6774.938430] 
May 28 17:33:30 robotron-cnc-2 kernel: [ 6774.938432] In recent history there were
May 28 17:33:30 robotron-cnc-2 kernel: [ 6774.938434] 1580520, 1590384, -2012888096, 1597224, and 1591692
May 28 17:33:30 robotron-cnc-2 kernel: [ 6774.938436] elapsed clocks between calls to the motion controller.
May 28 17:33:30 robotron-cnc-2 kernel: [ 6774.938450] This time, there were 35832 which is so anomalously
May 28 17:33:30 robotron-cnc-2 kernel: [ 6774.938453] large that it probably signifies a problem with your
May 28 17:33:30 robotron-cnc-2 kernel: [ 6774.938456] realtime configuration.  For the rest of this run of
May 28 17:33:30 robotron-cnc-2 kernel: [ 6774.938465] EMC, this message will be suppressed.
May 28 17:33:30 robotron-cnc-2 kernel: [ 6774.938468] 
May 28 17:44:21 robotron-cnc-2 kernel: [ 7425.851039] r8169: eth0: link down
May 28 17:44:35 robotron-cnc-2 kernel: [ 7440.222120] r8169: eth0: link up
May 28 18:08:41 robotron-cnc-2 kernel: [ 8886.371053] parport_pc 00:06: disabled
May 28 18:08:42 robotron-cnc-2 kernel: [ 8886.861766] RTAI[math]: unloaded.
May 28 18:08:42 robotron-cnc-2 kernel: [ 8887.152515] RTAI[malloc]: unloaded.
May 28 18:08:42 robotron-cnc-2 kernel: [ 8887.252054] RTAI[sched]: unloaded (forced hard/soft/hard transitions: traps 0, syscalls 0).
May 28 18:08:42 robotron-cnc-2 kernel: [ 8887.262732] I-pipe: Domain RTAI unregistered.
May 28 18:08:42 robotron-cnc-2 kernel: [ 8887.262861] RTAI[hal]: unmounted.
May 28 18:08:56 robotron-cnc-2 kernel: Kernel logging (proc) stopped.
May 28 18:08:56 robotron-cnc-2 rsyslogd: [origin software="rsyslogd" swVersion="4.2.0" x-pid="592" x-info="http://www.rsyslog.com"] exiting on signal 15.

I hope my English is readable.

Thank you for your help in advance.


Die Robbe

18 Jun 2014 17:47 #48060 by robbe
I realized that Firefox cannot add files.
So I am trying again with IE.
Attachments:

18 Jun 2014 17:58 - 18 Jun 2014 17:59 #48061 by ArcEye

robbe wrote: "Base - and servo thread at about 4500 (after a long-term test with heavy load),"

In your ini file, your servo period for a normal stepper setup should be 1,000,000 (nanoseconds, i.e. 1 ms).

The base period, if set at 4500, is far too short. That would account for losing steps, getting following errors and getting realtime errors, especially at high speed.

Double your base period and see if you get any errors. If you like, you can then reduce it again until errors appear and back off from there.

Theoretical base periods based upon latency tests are fine, but if they are too tight they leave no room for even the smallest glitch.

robbe wrote: "I have EMC2 on Ubuntu 9.10. in use. The whole on a Atom CPU board with isolated CPU step generation."

What kernel are you using?
The Atom runs well on the 10.04 build on the Live CD

I am guessing you mean you are using isolcpus=1 in the kernel parameters, otherwise you will have to explain

regards
Last edit: 18 Jun 2014 17:59 by ArcEye.

18 Jun 2014 18:02 #48062 by ArcEye
Hi

Just got your configs

Your servo period is right, but your base period is quite tight at 20,070.

Try increasing that to 30,000 and see if you get any more errors. That is still quite fast and should be much more reliable
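
For reference, the trade-off behind this advice can be put into numbers. The sketch below assumes software step generation in which each step needs at least one base period high and one base period low, so the highest possible step rate is 1 / (2 x base period). The function name and the default of one period each are illustrative assumptions, not values taken from the attached configuration.

```python
def max_step_rate_hz(base_period_ns, steplen_periods=1, stepspace_periods=1):
    """Upper bound on the step frequency for software step generation.

    Assumes each step occupies steplen_periods base periods high and
    stepspace_periods base periods low (both at least 1).
    """
    if base_period_ns <= 0:
        raise ValueError("base_period_ns must be positive")
    if steplen_periods < 1 or stepspace_periods < 1:
        raise ValueError("step length and step space need at least one period each")
    periods_per_step = steplen_periods + stepspace_periods
    return 1e9 / (periods_per_step * base_period_ns)


# Compare the base period found in the configs with the suggested one.
for period_ns in (20070, 30000):
    print(period_ns, "ns ->", round(max_step_rate_hz(period_ns)), "steps/s")
```

Under these assumptions, going from 20,070 to 30,000 lowers the ceiling from about 24.9 kHz to about 16.7 kHz, so it is worth checking that the required maximum velocity still fits.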

regards
The following user(s) said Thank You: andypugh

23 Jun 2014 20:13 #48199 by robbe
Hi,

I will try to reproduce the failure to see if the changed parameters solve the problem.
That is still hard, because the problem occurs only about once in 100 h.

isolcpus=1 is right, that's what I meant.

In a German forum somebody told me that the big negative number in the error log may be the problem I have:

May 28 17:33:30 robotron-cnc-2 kernel: [ 6774.938434] 1580520, 1590384, -2012888096, 1597224, and 1591692
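
One possible reading of that negative value, offered only as a sketch: if the elapsed-clock count is a larger counter difference stored in a signed 32-bit integer, a very long gap wraps around and shows up as a negative number. The code below undoes that wrap under this assumption and converts the result to time, assuming the four normal values each correspond to one 1,000,000 ns servo period. Both assumptions are not confirmed by the log, and the true gap could be larger by any multiple of 2**32 clocks.

```python
def unwrap_int32(value):
    """Reinterpret a signed 32-bit value as unsigned (adds 2**32 if negative)."""
    return value & 0xFFFFFFFF


# Values from the log line above.
normal_clocks = [1580520, 1590384, 1597224, 1591692]
wrapped = -2012888096

SERVO_PERIOD_NS = 1_000_000  # assumed: one normal interval equals one servo period

# Estimate the clock frequency from the normal intervals.
clocks_per_period = sum(normal_clocks) / len(normal_clocks)
clock_hz = clocks_per_period * 1e9 / SERVO_PERIOD_NS

# Smallest gap that would produce the wrapped value.
smallest_gap_clocks = unwrap_int32(wrapped)
smallest_gap_s = smallest_gap_clocks / clock_hz

print("estimated clock: %.3f GHz" % (clock_hz / 1e9))
print("smallest gap consistent with the wrap: %d clocks (%.2f s)"
      % (smallest_gap_clocks, smallest_gap_s))
```

If the assumptions hold, the motion controller was not called for well over a second at that point, which would fit the following errors. This is an interpretation, not a diagnosis.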

regards

17 Jul 2014 14:57 #48865 by robbe
I have run the computer without the machine connected.
I could now reproduce the error in 3 out of 3 tests.
But that takes a few hours.

The attachment contains the log data, sorted by start date and start time.
Maybe someone can see a pattern in it that leads to the error.

regards
Attachments:

17 Jul 2014 19:32 #48888 by andypugh
You are getting some sort of realtime error.

You also seem to (possibly) have something odd going on with the USB keyboard. Is there any chance of trying with an olde-worlde PS2 keyboard?

17 Jul 2014 19:52 #48892 by PCW
I agree with Andy, you have a hardware issue with USB
(and hardware issues, especially with USB or the hard drive, often break realtime).

[ 6060.048106] hub 5-0:1.0: port 1 disabled by hub (EMI?), re-enabling...
[ 6060.048119] usb 5-1: USB disconnect, address 5

18 Jul 2014 15:15 #48914 by robbe
Thank you for this hint.

I have now run a test with the mouse and keyboard disconnected.
If this also shows an error, I will try to disable the USB ports in the BIOS and plug the controls in via PS/2.

The fact that the USB errors occur a long time before the realtime crash is still a little strange to me.

18 Jul 2014 21:21 #48926 by andypugh

robbe wrote: "The fact, that the USB errors occur a long time before the realtime crash is still a little strange to me."

Indeed, I would be more sure that was the cause if it was closer in time. I was suggesting an experiment rather than a guaranteed fix.
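
To put a number on "closer in time", the kernel timestamps in square brackets can be compared directly. The sketch below is one way to do that. The function names, the regular expression and the two keyword patterns are illustrative assumptions, and the sample lines are copied from excerpts in this thread that may come from different sessions, so the printed gap is only an example. Feed it the dmesg output of a single run for a meaningful result.

```python
import re

# Matches a kernel timestamp such as "[ 6060.048106]" and captures the message.
LINE_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s*(.*)")


def event_times(lines, pattern):
    """Return the timestamps (seconds since boot) of lines matching pattern."""
    times = []
    for line in lines:
        m = LINE_RE.search(line)
        if m and re.search(pattern, m.group(2)):
            times.append(float(m.group(1)))
    return times


def gaps_to_previous(usb_times, rt_times):
    """For each realtime error, seconds since the latest earlier USB event (or None)."""
    gaps = []
    for t in rt_times:
        earlier = [u for u in usb_times if u <= t]
        gaps.append(t - max(earlier) if earlier else None)
    return gaps


sample = [
    "[ 6060.048106] hub 5-0:1.0: port 1 disabled by hub (EMI?), re-enabling...",
    "[ 6060.048119] usb 5-1: USB disconnect, address 5",
    "[ 6774.938377] RTAPI: ERROR: Unexpected realtime delay on task 1",
]
usb = event_times(sample, r"USB disconnect|disabled by hub")
rt = event_times(sample, r"Unexpected realtime delay")
print(gaps_to_previous(usb, rt))
```

A gap of a few milliseconds would point strongly at USB, while a gap of many minutes leaves the question open.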
