offsetpage_widget.py error while running smartprobe

30 Nov 2015 10:30 #66097 by DaBit
I initially thought it was a gmoccapy error, but it turns out to be one level deeper. newbynobi suggested that I ask for help here.

When running a slightly modified version of smartprobe.ngc as provided in the samples directory I get this error sooner or later:



When this happens program execution continues as usual. One time I have seen that the updating of the visited paths in the gremlin (purple lines) stops, other times everything behaves exactly like normal after closing that window.

Modifications to the provided smartprobe.ngc:
- (LOG,..) command changed to better suit my needs
- G38.2 command changed to G38.3+IF [#5070 NE 0] ..
- Code braced in IF [#<_task> EQ 1] .. to prevent preview interpreter overload with tens of thousands of probe moves.
In other words: nothing special.

LinuxCNC version: 2.8 RIP build running on the Debian Wheezy provided at linuxcnc.org.

This error happens some 10-30 minutes into smartprobe.ngc execution, and then again only several hours later. It is repeatable.
I am only seeing this with smartprobe.ngc and I have never seen this error before, not even when executing a lengthy 3D milling program that runs for 1-3 hours.
When it happens, I can access SnSMill.var normally, both read and write, although writing was hard to test because LinuxCNC is continuously accessing that file.

I doubt it is helpful, but I have a small video of program execution here: youtube.com/watch?v=mdQss2eHRvE

30 Nov 2015 12:54 #66102 by ArcEye
You may have to wait for Chris Morley who wrote offsetpage_widget.py for a definitive answer.

I would suspect that the gladevcp widget is trying to access the ini or the var file whilst another part of the main linuxcnc program has it open.
There are more accesses than you might suspect, as you found.
(I say the ini file or var file because interp::ini_load() actually parses the ini file to get the var file name every time.)

The gladevcp widget checks at line 434 that the file is not being edited by the widget itself:
    if self.filename and not self.editing_mode:
        self.reload_offsets()
but it appears to have no awareness of other users of the file.

The first question is probably, does running your copy of smartprobe.ngc under Axis, trigger any errors?

I suspect it will not.

regards

30 Nov 2015 13:15 - 30 Nov 2015 13:22 #66105 by DaBit
The 'problem' is that gmoccapy is quite deeply rooted in my configuration; switching to axis or gscreen and keeping the configuration as much as-is to prevent comparing a fully loaded apple to a bare metal orange is doable but not a 5-minute job.

Newbynobi found something ( link ):

Hello DaBit,
I checked the code of smartprobe and the offsetpage-widget and I am pretty sure I know where the problem is.

Smartprobe writes to the file while probing (parameters 5070 and 5061 - 5069), and the offsetpage-widget periodically reads the offsets for the different coordinate systems (parameters 5229 - 5381) from the same file.

There is surely a race condition: smartprobe has the file open and locked while the offsetpage-widget tries to read from it.

So IMHO the offsetpage-widget should do the reading in a try: except: block.

Chris Morley maintains the gladevcp widget.

Norbert


The error itself seems to be of the non-critical kind; it does not block machine operation. I have seen the preview display freeze once, but usually not even that happens and everything chugs along nicely. So waiting for Chris M to have some time available is no problem at all.
Last edit: 30 Nov 2015 13:22 by DaBit.

30 Nov 2015 13:37 #66107 by jepler
I have investigated this for just a few minutes and I think the developer of the gladevcp offset widget should investigate it as a bug in that widget.

The offset widget is working by reading the "var file". This file exists for the interpreter to save numbered variables, including offsets. Any other program which wishes to use the "var file" is subject to whatever whims the interpreter has about updating it.

The sequence of operations used by the interpreter is:
  1. write the new var file to a temporary var file
  2. delete the var file
  3. rename the temporary var file to the var file name

If the offset widget happens to access the var file between steps 2 and 3, the file will not be found. The offset widget must add code to gracefully recover from this condition.
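
A minimal sketch of what "gracefully recover" could look like in a reader of the var file. This is not the actual offsetpage_widget.py code: the function name `read_var_file` and the `previous` argument are hypothetical, and the sketch assumes each line of the var file holds a parameter number and a value separated by whitespace.

```python
def read_var_file(path, previous=None):
    """Return {parameter_number: value}; fall back to `previous` on failure.

    Hypothetical helper, assuming lines of the form "<number> <value>".
    """
    previous = {} if previous is None else previous
    try:
        with open(path) as handle:
            lines = handle.readlines()
    except (IOError, OSError):
        # The file is missing or unreadable, for example because the
        # interpreter is in the middle of replacing it. Keep the old data.
        return previous

    values = {}
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        try:
            values[int(fields[0])] = float(fields[1])
        except ValueError:
            continue  # skip lines that do not parse as number/value
    # If nothing usable was read, keep the previously known values.
    return values or previous
```

A periodic refresh could then call `offsets = read_var_file(filename, offsets)` and simply show the last known offsets until the next successful read.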

30 Nov 2015 13:44 - 30 Nov 2015 13:44 #66108 by ArcEye

The 'problem' is that gmoccapy is quite deeply rooted in my configuration; switching to axis or gscreen and keeping the configuration as much as-is to prevent comparing a fully loaded apple to a bare metal orange is doable but not a 5-minute job.


There is no need. It would completely eliminate other factors, but there is little doubt that the error is in the widget itself.

Smartprobe writes to the file while probing (parameters 5070 and 5061 - 5069), and the offsetpage-widget periodically reads the offsets for the different coordinate systems (parameters 5229 - 5381) from the same file.

There is surely a race condition: smartprobe has the file open and locked while the offsetpage-widget tries to read from it.

So IMHO the offsetpage-widget should do the reading in a try: except: block.


That is pretty much what I was saying. I would think that offsetpage_widget must account for the possibility that the file is 'busy', or does not exist, if it is accessed between the stages of read and re-write.

regards
Last edit: 30 Nov 2015 13:44 by ArcEye.

30 Nov 2015 14:18 #66110 by newbynobi
Hello Jeff,

The sequence of operations used by the interpreter is:
- write the new var file to a temporary var file
- delete the var file
- rename the temporary var file to the var file name


Why does this work in such a strange way?
Why not just open the file, modify its contents and close the file again?
I ask just out of curiosity; IMHO it sounds old-fashioned and complicated and will need more resources, won't it?

Norbert

30 Nov 2015 17:42 #66124 by jepler
Actually I got at least one detail of the interpreter's sequence of operations wrong. Please review Interp::save_parameters for the accurate sequence. However, it's still the case that at a certain step of the interpreter updating the contents of the var file, the var file is temporarily unavailable. And based on my current reading, there are additional moments where the file exists but its contents are not completely written (most likely, you would see an empty file but it is not impossible that you might see only the initial portion of the file)

Updating the file "in place" is not feasible for two reasons. First, because the numbers "1" and "10" take a varying number of bytes. Second, because the implementation of variable file writing needs to read the old variable file to determine which numbered parameters are saved.
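
The first reason can be illustrated with a small sketch. The two-line file contents below are made up for the example (a parameter number, a tab, and a value per line is an assumption about the layout), and an in-memory buffer stands in for the real file.

```python
import io

# Hypothetical two-parameter var file held in memory.
buffer = io.BytesIO(b"5061\t1\n5062\t7\n")

# Overwrite the value of parameter 5061 in place: "1" becomes "10".
buffer.seek(len(b"5061\t"))
buffer.write(b"10")

# The extra byte has overwritten the newline, so the two lines are fused.
corrupted = buffer.getvalue()
print(corrupted)  # b'5061\t105062\t7\n'
```

Because the longer value needs one more byte than the old one, everything after it would have to be shifted, which in practice means rewriting the rest of the file anyway.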

I am not opposed to designing a better sequence for the interpreter to write out the variable file; perhaps a scheme which uses UNIX hard links can even ensure that the var file always exists and is always complete. But that kind of change is much harder to get right than having code which reads the var file deal gracefully with it not being present or not containing all the usual values. It is my opinion that a patch to fix the latter is a much better choice to bugfix a stable version of linuxcnc.
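
As a rough illustration of a writer that never leaves the var file missing or half written, here is a sketch in Python. It is not a port of Interp::save_parameters, and it uses a temporary file plus an atomic rename rather than the hard-link scheme mentioned above. The function name `save_parameters_atomically` and the output format are assumptions made for the example.

```python
import os
import tempfile


def save_parameters_atomically(path, values):
    """Write {number: value} so that `path` is always present and complete.

    Hypothetical sketch: the new contents go to a temporary file in the
    same directory, which is then renamed over the old file in one step.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Same directory, so the rename stays within one filesystem.
    fd, temp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as handle:
            for number in sorted(values):
                handle.write("%d\t%.6f\n" % (number, values[number]))
            handle.flush()
            os.fsync(handle.fileno())  # push the data to disk before renaming
        # On POSIX the rename replaces the target atomically: readers see
        # either the old file or the new one, never a missing file.
        os.replace(temp_path, path)
    except Exception:
        if os.path.exists(temp_path):
            os.remove(temp_path)
        raise
```

Even with a writer like this, a reader that tolerates a missing or incomplete file remains the smaller and safer change for a stable branch, as argued above.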

For the master branch I would be happy to review a patch to improve var file writing, but unless it turns out to be exceptionally simple (or working around this in the other reader of the var file turns out to be less feasible than I expect) I would not advocate changing Interp::save_parameters in the stable versions.

(as a historical note, it's possible that using the link syscall was shied away from in the name of portability [to Windows]. I don't care whether the LinuxCNC gcode interpreter is portable to Windows)

30 Nov 2015 23:25 #66137 by DaBit
I am not a software guy, not even close. I can write some code on a basic to intermediate level as long as it includes fiddling with bits and bytes and a GUI is nowhere near the code. Check the components on the wiki for an idea of my abilities, not that fantastic.

That said: wouldn't some standard solution, for example SQLite, be a much better way to store data that is accessed by multiple clients of possibly multiple versions? The problem with 'roll your own' solutions for read/write file access, client/server things, etc. is that making it work in a controlled environment is not hard, but making it work in the harsh outside world under all those circumstances you didn't even think of is much, much, much harder.
The same for things like the tool table; the current structure makes it not that straightforward to synchronise the tools between CAM and control. In my case this is only partially automatic and probably not robust at all.

If this just sounds like rubbish from someone who clearly doesn't know what he is talking about, feel free to ignore it.

06 Dec 2015 01:01 #66489 by cmorley
To answer your question: yes, something else could be better. The problem is finding someone with the interest and ability to do it.
The problem isn't big enough yet to force us to address it.

I pushed a fix for the offset widget to 2.6, 2.7 and master.
It will be available in the next bug fix release or on buildbot now.
Thanks for the report.

Chris M

06 Dec 2015 10:57 #66504 by newbynobi

To answer your question: yes, something else could be better. The problem is finding someone with the interest and ability to do it.
The problem isn't big enough yet to force us to address it.


Hello Chris,

I do not agree 100%.
IMHO you should file it as an issue, otherwise in half a year we will all have forgotten it.

Norbert
