mtoivo wrote:First thing is the situation where master luxconsole process crashes. The slaves happily keep on processing the samples and wait for their now-dead master to call back someday. But if the master restarts and tries to connect to the slaves, they just say "I'm BUSY here, can't you see, go away". I have to admit I haven't looked much into the network code (which I should, I know), but I've concluded there are no persistent connections from master to slaves. The master only connects to gather samples etc. Should the connection be permanent? I don't know, but if it were up all the time, the slaves could react instantly when the master disappears. At least it would be nice if some kind of timer would kill the processing if the master hasn't contacted slaves for a while. Do you agree?
mtoivo wrote:Another issue in the network rendering is the "over-sampling" that occurs more or less every time. When the master reaches the halt spp value, it could tell the slaves to stop too. You might think this is no big deal, because usually the sampling stops at the latest after all slaves have been contacted for the last sample gathering. But with a cluster of 64 slaves, that takes a while, I tell you. The rendering itself might be completed in ~10 minutes with enough cores sampling, but the over-sampling can take the same time, if not more. This is not a show stopper, because it can be overcome with different master/slave combinations, but it's an idea worth thinking about.
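To make the "tell the slaves to stop too" idea concrete, here is a minimal Python sketch of a master notifying all slaves concurrently instead of one after another, so that 64 slaves do not each wait their turn. It is not LuxRender code: `broadcast_stop` and the `send_stop` callable are hypothetical names, and the actual stop message the master would send is left to the caller.

```python
from concurrent.futures import ThreadPoolExecutor


def broadcast_stop(slaves, send_stop, max_workers=16):
    """Send a stop request to every slave concurrently.

    `send_stop(slave)` is a hypothetical callable supplied by the caller; it
    stands in for whatever message the master would actually send.
    Returns a dict mapping each slave to True (request sent) or False (failed).
    """
    def attempt(slave):
        try:
            send_stop(slave)
            return slave, True
        except Exception:
            # An unreachable slave must not hold up the others.
            return slave, False

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(attempt, slaves))
```

The final sample gathering could then proceed at its usual pace, since the slaves would already have stopped producing extra samples.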
mtoivo wrote:The last two things are about flattening an FLM to PNG. In LuxGUI, you can open a plain FLM, but with luxconsole you can't. You have to have the .lxs and the rest of the crew too. Is this absolutely necessary? I mean, if there's no need to sample anymore, just output the PNG. Why is this important? In my case, I'm merging many FLMs into one and then outputting a PNG from the result. Keeping the .lxs and the export directory around just for the final .png output to work seems a bit dull. Also, when rendering the FLMs that are going to be merged into one, there's really no need to have a PNG of those sub-FLMs. But if I turn PNG writing off, the final output will not produce a .png either, since the decision to do that is written in the .lxs somewhere.
jeanphi wrote:mtoivo wrote:First thing is the situation where master luxconsole process crashes. [...] At least it would be nice if some kind of timer would kill the processing if the master hasn't contacted slaves for a while. Do you agree?
Worth looking into, but your idea will also have drawbacks: imagine you have a network failure (even a very small disconnect), by reacting instantly you might destroy hours of work.
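One way to balance the two concerns is a timer with a grace period long enough to ride out short disconnects. The Python sketch below shows only the bookkeeping; `MasterWatchdog` and `timeout_s` are hypothetical names, not existing luxconsole options, and what the slave does on expiry (stop, save its film, go idle) is left open.

```python
import time


class MasterWatchdog:
    """Tracks the last master contact and reports when a grace period has expired."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s   # hypothetical setting, chosen by the user
        self._clock = clock          # injectable so the logic can be tested without sleeping
        self._last_contact = clock()

    def touch(self):
        """Call whenever the master connects, e.g. to gather samples."""
        self._last_contact = self._clock()

    def expired(self):
        """True once the master has been silent for longer than the grace period."""
        return self._clock() - self._last_contact > self.timeout_s
```

A slave would call `touch()` on every master contact and poll `expired()` from its render loop. Setting the timeout to several times the master's polling interval would keep a brief network hiccup from throwing away hours of work.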
Abel wrote:I do like the idea of a time out for the slaves. On the other hand, in my setup an alternative way to deal with this situation would be to give the master the power to give a "drop whatever you're doing and listen to me" command; that way I wouldn't have to go to all the nodes to manually exit luxconsole and restart it in case something went wrong with the master.
Lord Crc wrote:We could add a command line flag to the slave which changes the behavior when the session key doesn't match. Instead of rejecting the connection request it could instead abort the current session and accept the new one.
Abel wrote:That would be really useful, but could also lead to unwanted scenarios where two masters are both controlling a bunch of slaves: one master could hijack the other's slaves, and two masters could try to take control of the same slave.
Abel wrote:-the master has an "evil" mode, forcing the nodes to start with the new job even if they are in "active" mode. This is useful when restarting jobs with minor modifications, in which case one won't have to wait for the time out period to pass.
Lord Crc wrote:Abel wrote:-the master has an "evil" mode, forcing the nodes to start with the new job even if they are in "active" mode. This is useful when restarting jobs with minor modifications, in which case one won't have to wait for the time out period to pass.
What we could do is to optionally have the master add a "force" parameter when connecting, and the slave would only abandon the current scene if this "force" parameter is present and forcing is enabled on the slave (à la my suggestion above). This could be exposed as a button in the GUI or something ("Connect (forced)", for example). This should prevent "accidental" session takeover.
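The rule described here can be written down as a small decision function. The Python sketch below is only an illustration of the proposal, not existing LuxRender behaviour: the names `Decision` and `decide` are hypothetical, and it assumes that an idle slave or a matching session key is always accepted.

```python
from enum import Enum


class Decision(Enum):
    ACCEPT = "accept"        # idle slave, or the same session reconnecting
    REJECT_BUSY = "busy"     # busy with another session, no takeover allowed
    TAKE_OVER = "take_over"  # abort the current session and accept the new one


def decide(current_key, incoming_key, force_requested, forcing_enabled):
    """Decide how a slave answers a connection request.

    current_key     -- session key of the running job, or None if the slave is idle
    incoming_key    -- session key presented by the connecting master
    force_requested -- the master sent the proposed "force" parameter
    forcing_enabled -- the slave was started with forcing allowed
    """
    if current_key is None or current_key == incoming_key:
        return Decision.ACCEPT
    # Takeover needs both sides to opt in, which guards against accidents.
    if force_requested and forcing_enabled:
        return Decision.TAKE_OVER
    return Decision.REJECT_BUSY
```

Because both the master and the slave have to opt in, a second master connecting normally still gets the "busy" answer, which covers the hijacking concern raised earlier in the thread.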