Possible bug with 2.7.8: shutdown of LinuxCNC

17 Jan 2017 02:05 #85884 by OT-CNC
I'm in the process of wiring and configuring my 7i77 and experienced a shutdown of LinuxCNC. See the attached screen notification.
I ran a memory test for over an hour and everything checked out okay. I was able to replicate the shutdown 3 times by using the up/down keyboard keys to rapidly change direction, then hitting the Home and End keys. I think it was the End key that triggered the black screen, and/or a combination of Page Down and End. I noticed this when testing the Z axis.
Can anyone else replicate this?

17 Jan 2017 02:17 #85886 by tommylight
That looks like a hardware problem, so the first thing I would do is change the computer, or at least the power supply in that computer. But that is a Dell Optiplex 755, so it is hard to find a working power supply for it.
Open the case and check the capacitors; they tend to go bad, and the tops start to swell. They should all be flat-topped. If they are swollen, change the computer.
It might also be a bad PCI contact (if using a 5i25 or 6i25), as it happened while reading the sserial port.

17 Jan 2017 03:10 #85889 by OT-CNC
Tommylight, thanks for the response. I will check my hardware. I have no issues running code or jogging my axes, except when I hit the End key by accident while cycling through the Z up/down keys. If it's hardware, I would expect it to be random, no?

17 Jan 2017 12:19 #85907 by tommylight
Yes and no; that depends on what is actually wrong, and that is not easy to pinpoint. In most cases under Linux you see this type of error from bad memory or a bad memory controller, and the error happens only when accessing that particular area of memory. Usually the problem only gets worse with time, or it can go days without appearing, depending on use. It can also be caused by high temperatures, so do check that too.

17 Jan 2017 14:14 #85919 by OT-CNC
I ran the memory test for over 1.5 hrs and the report was fine. If it's the memory controller or the memory, would that not show up during testing? I'll re-run it for longer if it's at all a reliable indicator. I'll check the temperature too, but I can tell you that the ambient shop temperature is low here this time of year, and I was able to trigger the error within the first few minutes of powering on.
Would installing an older version of LinuxCNC rule out a bug and verify that it's a hardware issue?
Could someone in the meantime try the same keystrokes on their machine to see if they get a computer crash?

17 Jan 2017 14:52 #85929 by PCW
Is it always the same error in the same place (in emcmotCommandHandler)?
If so, that does suggest software a bit more than hardware.

A simple brute force hardware test would be to use another PC just for a quick verification
(just swap hard drive and FPGA card)
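
One way to answer the "same error, same place" question is to copy the text of each crash screen by hand and compare which function each one names. A minimal Python sketch, assuming the crash text contains a `symbol+0xoffset` token; the sample strings are made-up placeholders, not real LinuxCNC output, and `emcmotCommandHandler` is the only name taken from this thread:

```python
import re
from collections import Counter

def crash_sites(reports, pattern=r"\b(\w+)\+0x[0-9a-f]+"):
    """Count the first 'symbol+0xoffset' symbol found in each report.

    Reports with no such token are counted under the key None.
    """
    sites = Counter()
    for text in reports:
        match = re.search(pattern, text)
        sites[match.group(1) if match else None] += 1
    return sites

def same_place_every_time(reports):
    """True if every report names the same symbol and none is missing one."""
    sites = crash_sites(reports)
    return len(sites) == 1 and None not in sites

# Hypothetical, hand-transcribed snippets (placeholders, not real logs).
reports = [
    "fault at emcmotCommandHandler+0x1a2",
    "fault at emcmotCommandHandler+0x1a2",
    "fault at emcmotCommandHandler+0x1a2",
]
print(crash_sites(reports))
print(same_place_every_time(reports))
```

If every crash names the same function, that leans toward software; crashes scattered across unrelated functions would lean toward hardware.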

17 Jan 2017 16:14 #85942 by jmelson
Ohh, a Dell 755. I think it might have the i810 chipset, or a descendant. I had occasional problems with a graphics crash on an Optiplex GX260 with the i810. It never happened with the Ubuntu 8.04 LinuxCNC, happened occasionally with the Ubuntu 10.04 LinuxCNC, but became a HUGE problem with the Debian LinuxCNC. Several people recommended hacks to the X11 config files, but they didn't fix it. In all these cases, the kernel would stay up, but the graphics would go haywire and X would shut down. (That doesn't exactly match your result; you seem to have a kernel panic.)

Anyway, the fix I finally had to go with was to get an Nvidia graphics card and disable the on-board graphics. I have not had any problems with the system since. There are quite a few documented problems with the Intel graphics chips and later kernels, if you look around. This applies not only to the i810 but also to several later integrated graphics chips.

Jon

17 Jan 2017 16:36 #85948 by OT-CNC
Bummer. That computer came up as a good candidate, so I sought one out. I didn't realize it was not suitable for Debian.
Unfortunately, I don't have another PC to test with at the moment.

17 Jan 2017 17:16 - 17 Jan 2017 17:17 #85959 by jmelson
Well, I got the Nvidia card for less than $15 on eBay. And, of course, I'm really GUESSING at YOUR problem; it could be something totally different. If you have any old graphics cards lying around, you could try one out and see if the problem goes away. In my case, my GX260 was slim form factor, so I jammed in a full-height card I had here, left the case open, and observed that it fixed the problem. So, I had a lot of confidence that I could fix it. Then I just had to search for a half-height graphics card and I was in business.

Jon
Last edit: 17 Jan 2017 17:17 by jmelson.

17 Jan 2017 18:42 #85974 by tommylight
Just tested on my laptop (Dell E6510, Mint 17.3, preempt-rt): no crash, no matter what I do with the up/down/home/end keys. Will try tomorrow on the actual machine and report back.
In the meantime, did you have a look at the capacitors? If they are swollen, no advice will help. If they are all nice and flat, see below.
Since it is a recurring error, open the case and remove all but one memory module, start the computer, and try to induce the error. Repeat this with each memory module you have, one at a time. Always put the memory in the slot nearest the processor. If they all fail, try the next slot, and so forth.
If you get the same result with more than 2 memory modules, chances are the memory controller on the north bridge has gone the way of the dodo.
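
The one-module-at-a-time procedure above can be written down as a small bookkeeping helper. This is only a sketch of the rule of thumb in this post; the function name, the module and slot labels, and the result wording are all made up for illustration:

```python
def diagnose(results):
    """Interpret one-module-at-a-time test results.

    results maps (module_label, slot_label) -> True if the crash was
    reproduced with only that module installed in that slot.
    """
    failing = sorted(
        {module for (module, _slot), crashed in results.items() if crashed}
    )
    if not failing:
        return "no crash reproduced with any single module"
    if len(failing) > 2:
        # Rule of thumb from the post: more than 2 modules failing the
        # same way points at the memory controller, not the modules.
        return "suspect memory controller"
    return "suspect module(s): " + ", ".join(failing)

# Hypothetical notes from testing three modules in the slot nearest the CPU.
results = {
    ("A", "slot1"): False,
    ("B", "slot1"): True,
    ("C", "slot1"): False,
}
print(diagnose(results))
```

Recording each (module, slot) trial as you go makes it easier to see whether the crash follows one module or shows up with all of them.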
