Hyperparameter metric in Google Cloud ML should contain the `val` prefix? - keras

When defining the hyperparameter metric for Google Cloud ML I can use mean_squared_error, but should I be using val_mean_squared_error instead if I want it to be comparing the validation set accuracy? Or does it do it on its own?
This is the sample hptuning config:
trainingInput:
  ...
  hyperparameters:
    goal: MINIMIZE
    hyperparameterMetricTag: ???mean_squared_error
And this is the fit invocation:
history = m.fit(train_x, train_y, epochs=epochs, batch_size=2048,
                shuffle=False,
                validation_data=(val_x, val_y),
                verbose=verbose,
                callbacks=callbacks)
Since I am passing validation data and using Keras, I am not sure whether I should use val_mean_squared_error.

The answer is: if you (I) want Google Cloud ML hyperparameter tuning to use the VALIDATION metric instead of the training metric while using Keras, you need to specify val_mean_squared_error (or val_accuracy etc).
If you stick to accuracy or mean_squared_error, you will bias the Google Cloud ML tuning process toward selecting overfitting models. To avoid overfitting while searching for the parameters, either create your own metric (as mentioned in a comment) or call fit with a validation set and use the val_ metrics.
I've updated the question to explicitly say I am using Keras, which automatically creates the val_mean_squared_error.
To find the answer, I realized I could run a simple test: launch one tuning job with val_mean_squared_error and one with mean_squared_error, both using Keras and calling fit with validation data, then compare each job's results with the reported metrics.

Related

HP Tuning with a keras model and setting hyperparameterMetric to an Evaluation Metric and not a training metric

It is about Hyperparameter Tuning with GCP.
With estimators I can easily set the desired hyperparameterMetric to the proper metric on evaluation data. But I don't see how I can do that for a keras (tf.keras and keras) model?
I mean where can I "assign" the right metric? I need the hyperparameterMetric to be the metric for evaluation data.
Edit:
model.fit returns a History object whose history attribute is a dict like:
{'acc': [0.9843952109499714],
 'loss': [0.050826362343496051],
 'val_acc': [0.98403786838658314],
 'val_loss': [0.0502210383056177]
}
Does GCP work now if I just set my desired validation metric to 'val_acc' in the config file without doing anything else?
You must use the Keras TensorBoard callback.
Use the prefix "epoch_" in the metric tag: hyperparameterMetricTag=epoch_val_acc for validation accuracy and hyperparameterMetricTag=epoch_acc for training accuracy.
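As a rough illustration of the naming involved, the sketch below takes the history dict shown in the question and separates training metrics from validation metrics by the val_ prefix, then builds the epoch_-prefixed tag described in the answer. The helper names split_history and tensorboard_tag are made up for this example, and the exact tags written by the TensorBoard callback can differ between Keras versions, so it is worth checking your own event files.

```python
# Keras-style history dict, copied from the question above.
history = {
    "acc": [0.9843952109499714],
    "loss": [0.050826362343496051],
    "val_acc": [0.98403786838658314],
    "val_loss": [0.0502210383056177],
}


def split_history(history):
    """Separate training metrics from validation metrics by the 'val_' prefix."""
    train = {k: v for k, v in history.items() if not k.startswith("val_")}
    val = {k: v for k, v in history.items() if k.startswith("val_")}
    return train, val


def tensorboard_tag(history_key):
    """Hypothetical helper: apply the 'epoch_' prefix suggested in the answer."""
    return "epoch_" + history_key


train_metrics, val_metrics = split_history(history)
print(sorted(train_metrics))       # ['acc', 'loss']
print(sorted(val_metrics))         # ['val_acc', 'val_loss']
print(tensorboard_tag("val_acc"))  # epoch_val_acc
```

The val_ keys only appear when fit is given validation data, so a missing key here usually means the validation set was not passed.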

Monitor F1 Score (or a custom metric in general) in a keras callback

Keras 2.0 removed F1 score, but I would like to monitor its value. I am using a sequential model to train a Neural Net.
I defined a function, as suggested here How to calculate F1 Macro in Keras?.
This function works fine only if I use it inside model.compile. That way I see its value at each step. The problem is that I don't just want to see its value; I would like my training to behave differently according to its value, using Keras callbacks.
If I try to insert my custom metric in the callbacks then I get this error:
'function object is not iterable'
Do you know how to define a function such that it can be used as an argument in the callbacks?
Keras callbacks let you retrieve the model at different points in training, based on the metric you keep track of. Tracking a metric this way does not change the objective the model is trained on.
You can train your model only with respect to some loss function, for example cross entropy for a classification problem. The readily available loss functions in Keras are given here.
Precision, recall, and F1 score are not differentiable functions, so they cannot be used as a loss function for model training.
However, if you want to tune your hyperparameters (such as learning rate or class weights) to improve the F1 score, you can do that.
For tuning hyperparameters you can use hyperopt (see its tutorials).
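One way to act on F1 during training, sketched under assumptions: compute F1 from hard predictions at the end of each epoch and let a small tracker decide when to stop. Everything below is plain Python; binary_f1 and F1Tracker are hypothetical names, not Keras APIs. Wiring update() into a Keras callback (for example from on_epoch_end, setting self.model.stop_training = True when it returns True) is an assumption about your setup rather than something the answer above describes.

```python
def binary_f1(y_true, y_pred):
    """F1 for binary labels (0/1), computed from hard predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # without true positives, F1 is 0 by convention
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


class F1Tracker:
    """Remembers the best F1 seen so far and signals a stop after
    `patience` consecutive epochs without improvement."""

    def __init__(self, patience=2):
        self.patience = patience
        self.best = float("-inf")
        self.bad_epochs = 0

    def update(self, f1):
        """Record this epoch's F1; return True when training should stop."""
        if f1 > self.best:
            self.best = f1
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


print(binary_f1([1, 1, 0, 0], [1, 0, 1, 0]))  # 0.5
```

Because the counts come from thresholded predictions, the score has no useful gradient, which is why it can steer callbacks or hyperparameter search but not serve as the loss.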

Hyperparameters tuning with Google Cloud ML Engine and XGBoost

I am trying to replicate the hyperparameter tuning example reported at this link but I want to use scikit learn XGBoost instead of tensorflow in my training application.
I am able to run multiple trials in a single job, one for each hyperparameter combination. However, the Training output object returned by ML-Engine does not include the finalMetric field, which reports metric information (see the differences in the picture below).
What I get with the example of the link above:
Training output object with Tensorflow training app
What I get running my Training application with XGBoost:
Training output object with XGBoost training app
Is there a way for XGBoost to return training metrics to ML-Engine?
It seems that this process is automated for tensorflow, as specified in the documentation:
How Cloud ML Engine gets your metric
You may notice that there are no instructions in this documentation
for passing your hyperparameter metric to the Cloud ML Engine training
service. That's because the service monitors TensorFlow summary events
generated by your training application and retrieves the metric.
Is there a similar mechanism for XGBoost?
Now, I can always dump each trial's metric results to a file at the end of the trial and then analyze them manually to select the best parameters. But, by doing so, am I losing the automated mechanism offered by Cloud ML Engine, especially concerning the "ALGORITHM_UNSPECIFIED" hyperparameter search algorithm?
i.e.,
ALGORITHM_UNSPECIFIED: [...] applies Bayesian optimization to search
the space of possible hyperparameter values, resulting in the most
effective technique for your set of hyperparameters.
Hyperparameter tuning support for XGBoost was implemented in a different way. We created the cloudml-hypertune Python package to help with it. We're still working on the public documentation for it. In the meantime, you can refer to this staging sample to learn how to use it.
Sara Robinson over at google put together a good post on how to do this. Rather than regurgitate and claim it as my own, I'll post this here for anyone else that comes across this post:
https://sararobinson.dev/2019/09/12/hyperparameter-tuning-xgboost.html
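For readers who cannot follow the links, here is a minimal sketch of the reporting call. The cloudml-hypertune package exposes a HyperTune object whose report_hyperparameter_tuning_metric method records the metric for the tuning service. The wrappers make_reporter and report_metric, and the tag and value in the usage comment, are illustrative assumptions. The package has to be installed in the training environment, and the tag should match hyperparameterMetricTag in your config.

```python
def make_reporter():
    """Create the reporter. Imported lazily so this module loads
    even where the cloudml-hypertune package is not installed."""
    import hypertune  # provided by the cloudml-hypertune package
    return hypertune.HyperTune()


def report_metric(reporter, tag, value, step):
    """Report one metric value for the current trial."""
    reporter.report_hyperparameter_tuning_metric(
        hyperparameter_metric_tag=tag,
        metric_value=value,
        global_step=step,
    )


# Hypothetical usage at the end of a trial, after scoring the fitted
# XGBoost model on held-out data:
#   report_metric(make_reporter(), "my_metric_tag", validation_score, step=1)
```

Passing the reporter in as an argument keeps the training code testable without the package; that is a design choice for this sketch, not something the package requires.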

How to use cross-validation split data with RandomizedSearchCV

I'm trying to transfer my model from single run to hyper-parameter tuning using RandomizedSearchCV.
In my single run case, my data is split into train/validation/test data.
When I run RandomizedSearchCV on my train_data with default 3-fold CV, I notice that the length of my train_input is reduced to 66% of train_data (which makes sense in a 3-fold CV...).
So I'm guessing that I should merge my initial train and validation set into a larger train set and let RandomizedSearchCV split it into train and validation sets.
Would that be the right way to go?
My question is: how can I access the remaining 33% of my train_input to feed it to my validation accuracy test function (note that my score function is running on test set)?
Thanks for your help!
Yoann
I'm not sure that my code would help here since my question is rather generic.
This is the answer I found by going through sklearn's code: RandomizedSearchCV doesn't return the split validation data in an easy way, so I should definitely merge my initial train and validation sets into a larger train set and let RandomizedSearchCV split it into train and validation sets.
The train_data is split for CV into train/validation sets by a cross-validator (in my case, Stratified K-Folds: http://scikit-learn.org/stable/modules/generated/sklearn.model_selection.StratifiedKFold.html)
My estimator is defined as follows:
class DNNClassifier(BaseEstimator, ClassifierMixin):
It needs a score function to be able to evaluate the CV performance on the validation set. There is a default score function defined in the ClassifierMixin class (which returns the mean accuracy and requires a predict function to be implemented in the Estimator class).
In my case, I implemented a custom score function within my estimator class.
The hyperparameter search and CV fit is done calling the fit function of RandomizedSearchCV.
RandomizedSearchCV(DNNClassifier(), param_distribs).fit(train_data)
This fit function runs the estimator's custom fit function on the train set and then the score function on the validation set.
This is done using the _fit_and_score function from the _validation module.
So I can only access the automatically split validation set (33% of my train_data input) once my estimator's fit function has finished.
I'd have preferred to access it within my estimator's fit function so that I can use it to plot validation accuracy over training steps and for early stop (I'll keep a separate validation set for that).
I guess I could reconstruct the automatically generated validation set by looking for the missing indexes from my initial train_data (the train_data used in the estimator's fit function has 66% of the indexes of the initial train_data).
If that is something that someone has already done I'd love to hear about it!
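The index reconstruction mentioned above could look roughly like this. It assumes each sample carries a stable index (for example a pandas index or an explicit ID column) that survives the split; held_out_indices is a hypothetical helper, not part of scikit-learn, and the toy indices are invented.

```python
def held_out_indices(all_indices, train_fold_indices):
    """Return the indices present in the full training data but absent
    from the fold passed to the estimator's fit, in original order."""
    seen = set(train_fold_indices)
    return [i for i in all_indices if i not in seen]


# Toy example: 9 samples and 3-fold CV, so fit sees 6 samples and 3 are held out.
all_idx = list(range(9))
fold_train_idx = [0, 1, 3, 4, 6, 7]
print(held_out_indices(all_idx, fold_train_idx))  # [2, 5, 8]
```

If you control the cross-validator yourself, its split(X, y) method yields the train and validation index arrays for each fold directly, which may be simpler than reconstructing them afterwards.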

Pytorch: Intermediate testing during training

How can I test my pytorch model on validation data during training?
I know that there is the function myNet.eval(), which apparently switches off any dropout layers, but does it also prevent the gradients from being accumulated?
Also how would I undo the myNet.eval() command in order to continue with the training?
If anyone has some code snippet / toy example I would be grateful!
How can I test my pytorch model on validation data during training?
There are plenty of examples with train and test steps for every epoch during training. An easy one would be the official MNIST example. Since pytorch does not offer any high-level training, validation or scoring framework, you have to write it yourself. Commonly this consists of:
a data loader (commonly based on torch.utils.data.DataLoader)
a main loop over the total number of epochs
a train() function that uses training data to optimize the model
a test() or valid() function to measure the effectiveness of the model given validation data and a metric
This is also what you will find in the linked example.
Alternatively you can use a framework that provides basic looping and validation facilities so you don't have to implement everything by yourself all the time.
tnt is torchnet for pytorch, supplying you with different metrics (such as accuracy) and abstraction of the train loop. See this MNIST example.
inferno and torchsample attempt to model things very similar to Keras and provide some tools for validation
skorch is a scikit-learn wrapper for pytorch that lets you use all the tools and metrics from sklearn
Also how would I undo the myNet.eval() command in order to continue with the training?
myNet.train() or, alternatively, supply a boolean to switch between eval and training: myNet.train(True) for train mode.
I know that there is the function myNet.eval(), which apparently switches off any dropout layers, but does it also prevent the gradients from being accumulated?
It doesn't prevent gradients from accumulating.
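But during testing you do want to skip gradient tracking. In PyTorch versions before 0.4, you do this by marking the input Variable to the network as volatile=True; from 0.4 onward, volatile has no effect and the forward pass is wrapped in torch.no_grad() instead. Either way, it saves some of the time and memory used in the forward calculation.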
Also how would I undo the myNet.eval() command in order to continue with the training?
myNet.train()
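Putting the answers together, a toy loop might look like the sketch below. The model, the synthetic data, and the hyperparameters are invented for illustration, and it assumes PyTorch 0.4 or later, where torch.no_grad() takes over the role of volatile=True.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic regression data (made up for this example).
x = torch.randn(200, 4)
y = x.sum(dim=1, keepdim=True)
train_loader = DataLoader(TensorDataset(x[:160], y[:160]), batch_size=32, shuffle=True)
val_loader = DataLoader(TensorDataset(x[160:], y[160:]), batch_size=32)

# Dropout is included only so that train()/eval() actually changes behaviour.
model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Dropout(0.1), nn.Linear(16, 1))
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)
loss_fn = nn.MSELoss()


def validate(model, loader):
    """Average validation loss, computed in eval mode without tracking gradients."""
    model.eval()                  # switch off dropout and similar layers
    total, count = 0.0, 0
    with torch.no_grad():         # no autograd graph is built here
        for xb, yb in loader:
            total += loss_fn(model(xb), yb).item() * len(xb)
            count += len(xb)
    model.train()                 # undo eval() before training continues
    return total / count


val_losses = []
for epoch in range(3):
    model.train()
    for xb, yb in train_loader:
        optimizer.zero_grad()
        loss_fn(model(xb), yb).backward()
        optimizer.step()
    val_losses.append(validate(model, val_loader))
    print(f"epoch {epoch}: val_loss={val_losses[-1]:.4f}")
```

Note that eval() and no_grad() do different jobs: the first changes layer behaviour, the second stops gradient bookkeeping, so validation code usually wants both.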
