Keras callback execution order?

I'm trying to understand the Keras callback execution order. Suppose we pass multiple callbacks to model.fit(), each of which has an on_epoch_end method. When we reach the end of an epoch, in which order will the callbacks be executed? Does the main process spawn multiple child processes and assign one to each callback?
It would be nice if the documentation were more detailed.

They should be called in the order you've added them.
If you look at the implementation of the CallbackList class, which manages your callbacks, you will see that it iterates over them in order of appearance.
For example, here in on_epoch_end.
This is also how the class is used in the training loop, and it does not seem that a separate process is spawned.

They will be executed in the order they appear in the callbacks list you pass to model.fit().
No child process is created to perform the execution.
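A minimal, framework-free sketch of the sequential iteration both answers describe (this is an illustration of the pattern, not the real Keras CallbackList class):

```python
class Callback:
    """Base class with the hook Keras-style callbacks implement."""
    def on_epoch_end(self, epoch, logs=None):
        pass

class CallbackList:
    """Simplified sketch: callbacks fire one after another, in list order."""
    def __init__(self, callbacks):
        self.callbacks = list(callbacks)

    def on_epoch_end(self, epoch, logs=None):
        # A plain sequential loop -- no threads or child processes involved.
        for cb in self.callbacks:
            cb.on_epoch_end(epoch, logs)

order = []

class First(Callback):
    def on_epoch_end(self, epoch, logs=None):
        order.append("first")

class Second(Callback):
    def on_epoch_end(self, epoch, logs=None):
        order.append("second")

CallbackList([First(), Second()]).on_epoch_end(epoch=0)
print(order)  # ['first', 'second']
```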

Related

In AzureML, start_logging will start asynchronous execution or synchronous execution?

The Microsoft AzureML documentation says: "A run represents a single trial of an experiment. Runs are used to monitor the asynchronous execution of a trial" and "A Run object is also created when you submit or start_logging with the Experiment class."
Regarding start_logging, as far as I know, once we have started the run with this method, we have to stop it with the complete method when the run is finished. This suggests start_logging is a synchronous way of creating an experiment run. However, the Run object created by start_logging is said to monitor the asynchronous execution of a trial.
Can anyone clarify whether start_logging starts asynchronous or synchronous execution?
start_logging is considered asynchronous execution, because it creates interactive run sessions. Within a single experiment there can be multiple interactive sessions working in parallel; there is no requirement that they run sequentially.
The individual behaviour of each run can be controlled through the method's parameters (args and kwargs).
When start_logging is called, an interactive run (for example, from a Jupyter notebook) is created. The metrics and artifacts produced while that run is active are recorded against it. If an outputs directory is specified for an interactive run, its contents are uploaded for that run.
The following code block illustrates how start_logging is used:
experiment = Experiment(your_workspace, "your_experiment_name")
run = experiment.start_logging(outputs=None, snapshot_directory=".", display_name="test")
...
run.log("Accuracy_Value", accuracy)
run.complete()
The code below shows the basic signature of start_logging:
start_logging(*args, **kwargs)

Stop tensorflow training process programmatically

I'm writing a platform for running trainings on a server, and I need a way to halt the training process via an API. That is, when I receive a request on a REST controller, the main thread needs to stop a training run that could otherwise take days.
I see that TensorFlow has a Coordinator class and an EarlyStopping callback, but I don't see anything that can stop the training on demand.
Something like: model.stop()
Yes you can, but it works a bit differently.
You can't tell the model to stop, but the model can ask you whether it should stop. That is done using a callback.
Here is a simple approach:
Implement your own callback
(documentation here). Your callback could check for, say, a file in a folder, e.g. "../.../stop_training.txt".
Add your callback to your model for the event you want to use.
Create an API endpoint such as "https://..../stop-training" which simply creates that stop_training.txt file.
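A minimal sketch of such a callback. In real code you would subclass tf.keras.callbacks.Callback, whose hooks fit() invokes automatically; the flag-file path and the DummyModel used to exercise it here are illustrative only:

```python
import os
import tempfile

# Hypothetical flag file that the REST endpoint would create on demand.
stop_file = os.path.join(tempfile.gettempdir(), "stop_training.txt")

class StopFileCallback:
    """Sketch of a Keras-style callback. In real code, subclass
    tf.keras.callbacks.Callback so fit() wires up set_model and the hooks."""
    def __init__(self, path):
        self.path = path
        self.model = None

    def set_model(self, model):
        self.model = model

    def on_epoch_end(self, epoch, logs=None):
        if os.path.exists(self.path):
            # Keras checks this flag between epochs/batches and halts fit().
            self.model.stop_training = True

class DummyModel:
    stop_training = False

cb = StopFileCallback(stop_file)
cb.set_model(DummyModel())
open(stop_file, "w").close()   # this is what the REST endpoint would do
cb.on_epoch_end(epoch=0)
print(cb.model.stop_training)  # True
os.remove(stop_file)
```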

Inheritance of variables in callbacks

I have the problem that I receive several events in a Node.js worker at very short intervals. These trigger several callbacks, as shown in the figure, each of which must access the parameters of the event. The structure of the callbacks cannot be changed, so passing the parameters as arguments is not possible.
My idea was to store the parameters in a global object and access them via an "event-loop" identifier.
Unfortunately, there seems to be no such identifier.
Another idea was to expose the parameters as a variable shared by all callbacks within the event. However, I could not find out how to make such a variable visible to the other callbacks.
Does anyone have a suggestion how to solve this problem?

Treat async code as threads in Node.js?

Kind of a weird question. Imagine you have a situation where you need to run 10 SYNCHRONOUS functions. It doesn't matter when each completes; you just want to know when all 10 are done, i.e.:
f1()
f2()
f3()
...
f10()
doStuffWithResult();
Now, if you use promises like so, assuming you have rewritten each as a promise:
Promise.All([f1,f2,f3,f4,f5,f6,f7,f8,f9,f10])
.then(() => {
doStuffWithResult();
})
Would you see a performance increase? Theoretically, I want to say no, because these functions are still synchronous and everything still runs on one thread.
Thanks!
Would you see a performance increase?
No, what you are proposing would not be faster.
Promises do not create threads. All they do is provide a cooperative system for keeping track of when asynchronous operations are complete and then notifying interested parties of success or failure. They also provide services for propagating errors when asynchronous operations are nested.
And, your proposed Promise.all() code would not even work. You must pass an array of promises to Promise.all(), not an array of function references. In your example, your functions would not even be called.
And, if you changed your code to something that would actually execute, then it would likely be slower than just calling the synchronous functions directly because you'd be executing promise code and all .then() handlers execute on a future tick (not synchronously).
In node.js, the only way to execute synchronous things in parallel is to launch child processes (that can execute in parallel) or pass the operations to some native code that can use actual OS threads.
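For completeness, here is roughly what a version that actually runs would look like: each function must be invoked so that an array of promises, not function references, is passed to Promise.all. The makeTask helper stands in for f1..f10 (hypothetical), and the answer's point stands that this adds bookkeeping overhead rather than parallelism for synchronous work:

```javascript
// Hypothetical stand-ins for f1..f10: synchronous work wrapped in a promise.
const makeTask = (n) => () => Promise.resolve(n * n);
const tasks = [1, 2, 3].map(makeTask);

// Note tasks.map((fn) => fn()) -- the functions must be INVOKED here,
// so Promise.all receives promises rather than function references.
Promise.all(tasks.map((fn) => fn())).then((results) => {
  console.log(results); // [1, 4, 9]
});
```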

Good approaches for queuing simultaneous NodeJS processes

I am building a simple application to download a set of XML files and parse them into a database using the async module (https://npmjs.org/package/node-async) for flow control. The overall flow is as follows:
Download list of datasets from API (single Request call)
Download metadata for each dataset to get link to XML file (async.each)
Download XML for each dataset (async.parallel)
Parse XML for each dataset into JSON objects (async.parallel)
Save each JSON object to a database (async.each)
In effect, for each dataset there is a parent process (2) which sets off a series of asynchronous child processes (3, 4, 5). The challenge I am facing is that, because so many parent processes fire before all of the children of a particular parent are complete, child processes pile up in the event loop, and it takes a long time for all of the children of a given parent to resolve and allow garbage collection to clean everything up. As a result, even though the program doesn't appear to have any memory leaks, memory usage is still too high, ultimately crashing the program.
One solution that worked was to make some of the child processes synchronous so that they can be grouped together in the event loop. However, I have also seen an alternative solution discussed here: https://groups.google.com/forum/#!topic/nodejs/Xp4htMTfvYY, which pushes parent processes into a queue and only allows a certain number to run at once. My question is: does anyone know of a more robust module for handling this type of queueing, or any other viable alternative for this kind of flow control? I have been searching, but so far no luck.
Thanks.
I decided to post this as an answer:
Don't launch all of the processes at once. Let the callback of one request launch the next. The overall work is still asynchronous, but each request runs in series. You can then pool a certain number of connections to run simultaneously to maximize I/O throughput. Look at async.eachLimit and replace each of your async.each calls with it.
Your async.parallel calls may be causing issues as well.
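A minimal sketch of the limited-concurrency pattern that async.eachLimit provides, in plain Node (the real module also handles errors and edge cases; this is illustrative only):

```javascript
// Run an async worker over items with at most `limit` in flight at once.
function eachLimit(items, limit, worker) {
  return new Promise((resolve) => {
    let next = 0;      // index of the next item to start
    let active = 0;    // items currently in flight
    let finished = 0;  // items completed
    function launch() {
      while (active < limit && next < items.length) {
        active += 1;
        worker(items[next++], () => {
          active -= 1;
          finished += 1;
          if (finished === items.length) resolve();
          else launch(); // a slot freed up: start the next item
        });
      }
    }
    if (items.length === 0) resolve();
    else launch();
  });
}

// Usage: four "downloads", at most two in flight at any moment.
const order = [];
eachLimit([1, 2, 3, 4], 2, (item, done) => {
  order.push(item);
  setImmediate(done); // simulate async I/O completing
}).then(() => console.log(order)); // [1, 2, 3, 4]
```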
