Undeprecating tensorflow - python-3.x

When building a DNNRegressor and predicting values with
print(list(estimator.predict({"p": np.array([[0.,0.],[1.,0.],[0.,1.],[1.,1.]])})))
the console prints the following warning:
WARNING:tensorflow:From "...\tensorflow\contrib\learn\python\learn\estimators\dnn.py":692: calling BaseEstimator.predict (from tensorflow.contrib.learn.python.learn.estimators.estimator) with x is deprecated and will be removed after 2016-12-01.
Instructions for updating:
Estimator is decoupled from Scikit Learn interface by moving into
separate class SKCompat. Arguments x, y and batch_size are only
available in the SKCompat class, Estimator will only accept input_fn.
Example conversion:
est = Estimator(...) -> est = SKCompat(Estimator(...))
So I went to line 692 of dnn.py, and this is what I found:
preds = super(DNNRegressor, self).predict(
    x=x,
    input_fn=input_fn,
    batch_size=batch_size,
    outputs=[key],
    as_iterable=as_iterable)
Following the advice in the warning, and assuming that super(DNNRegressor, self) is an Estimator, I did
preds = estimator.SKCompat(super(DNNRegressor, self)).predict(...)
But doing that I get
TypeError: predict() got an unexpected keyword argument 'input_fn'
which does not look like a TensorFlow error.
The problem is that I don't know how to get rid of the warning (it is a warning, not an error).

This portion of the GitHub tree is under active development. I expect this warning message to go away once the Estimator class is moved into tf.core, which is scheduled for version r1.1. I found the 2017 TensorFlow Dev Summit video by Martin Wicke very informative on the future plans for high-level TensorFlow.
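In the meantime, the conversion the warning suggests can be applied to your estimator instance itself rather than to its superclass. A minimal sketch, assuming tf.contrib.learn.SKCompat is available in your TF 1.x build; the feature column construction here is a guess based on the {"p": ...} dict in the question, since the original constructor call was not shown:
import numpy as np
import tensorflow as tf
from tensorflow.contrib import learn

# Hypothetical construction; replace with your actual DNNRegressor arguments.
feature_columns = [tf.contrib.layers.real_valued_column("p", dimension=2)]
regressor = learn.DNNRegressor(feature_columns=feature_columns, hidden_units=[4])

# Wrap the already-constructed estimator; SKCompat keeps the
# Scikit-Learn style x/y/batch_size arguments without the warning.
sk_regressor = learn.SKCompat(regressor)
predictions = sk_regressor.predict(x={"p": np.array([[0., 0.], [1., 0.], [0., 1.], [1., 1.]])})
This also explains the TypeError above: SKCompat.predict accepts only the Scikit-Learn style arguments (x, batch_size, outputs), not input_fn.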

Related

ValueError: Tensor Tensor("dense_4/Sigmoid:0", shape=(?, 1025), dtype=float32) is not an element of this graph

Today I suddenly started getting this error for no apparent reason while running model.fit(). It used to work before; I am using TF 2.3.0, more specifically its Keras module.
The function is called on validation inside a generator, which is fed into model.predict().
Basically, I load a checkpoint, I resume training the network, and I make a prediction on validation.
The error keeps occurring even when training a model from scratch and erasing all the related data. It's as if something had been hardcoded somewhere, since I was able to run model.fit() up until a few hours ago.
I saw several solutions like THIS, but none of those variations works for me, as they lead to trickier error messages.
I even tried installing a different version of TF, thinking the error was due to some old version, but it still occurs.
I will answer my own question, as this one was particularly tricky and none of the solutions I found on the internet worked for me, probably because they were outdated.
I'll write down just the relevant part to add to the code; feel free to add more technical explanation.
I like using args for passing variables, but it can work without:
import tensorflow as tf
from tensorflow.python.keras.backend import set_session
from tensorflow.keras.models import load_model
from generator import generator  # custom generator (assumed to expose a generator() callable)

def main(args):
    # open a new session and store the default TF graph
    args.sess = tf.compat.v1.Session()
    args.graph = tf.compat.v1.get_default_graph()
    set_session(args.sess)
    # define training generator
    train_generator = generator(args.train_data)
    # load model and resume training
    args.model = load_model(args.model_path)
    args.model.fit(train_generator)
Then, in the model prediction function:
# In my specific case, the predict_output() function is
# called inside the generator function
def predict_output(args, x):
    # run the prediction in the same graph and session used for training
    with args.graph.as_default():
        set_session(args.sess)
        y = args.model.predict(x)
    return y

Error in Keras when training with MirroredStrategy

When I use MirroredStrategy to train my model in Keras, I get an error that I do not get when not using MirroredStrategy. Here is some sample code:
# Create a MirroredStrategy.
strategy = tf.distribute.MirroredStrategy()
print('Number of devices: {}'.format(strategy.num_replicas_in_sync))

# Open a strategy scope.
with strategy.scope():
    # Everything that creates variables should be under the strategy scope.
    # In general this is only model construction & `compile()`.
    model = Model(...)
    model.compile(optimizer=opt,
                  loss=['mean_absolute_error', 'mean_absolute_error'],
                  loss_weights=[l1, l2])

# Train the model on all available devices.
model.fit(train_dataset, validation_data=val_dataset, ...)

# Test the model on all available devices.
model.evaluate(test_dataset)
The error that I receive is TypeError: Input 'y' of 'Equal' Op has type variant that does not match type float32 of argument 'x'.
I believe this error has to do with the loss function. It is important to note that I have 1 input and 2 outputs for my model.
Upgrading TensorFlow seems to have fixed the issue; I am currently using the latest release.

NumPy arrays used in training in TF1 Keras have much lower accuracy in TF2

I had a neural net in Keras that performed well. With the deprecations that came with TensorFlow 2, I had to rewrite the model, and now it is giving me worse accuracy metrics.
My suspicion is that TF2 wants you to use its tf.data structures to train models; they give an example of how to go from NumPy to tf.data.Dataset here.
So I did:
train_dataset = tf.data.Dataset.from_tensor_slices((X_train_deleted_nans, y_train_no_nans))
train_dataset = train_dataset.shuffle(SHUFFLE_CONST).batch(BATCH_SIZE)
Once training starts, I get this warning:
2019-10-04 23:47:56.691434: W tensorflow/core/common_runtime/base_collective_executor.cc:216] BaseCollectiveExecutor::StartAbort Out of range: End of sequence
[[{{node IteratorGetNext}}]]
Appending .repeat() to the creation of my tf.data.Dataset solved the error, as suggested by duysqubix in his eloquent solution posted here:
https://github.com/tensorflow/tensorflow/issues/32817#issuecomment-539200561
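For concreteness, a sketch of the fix applied to the snippet above, keeping the same variable names:
train_dataset = tf.data.Dataset.from_tensor_slices((X_train_deleted_nans, y_train_no_nans))
train_dataset = train_dataset.shuffle(SHUFFLE_CONST).batch(BATCH_SIZE).repeat()
Since a repeated dataset never signals end-of-sequence, also pass steps_per_epoch to model.fit() so each epoch knows when to stop.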

TensorFlow - Shape Mismatch

I have just started with TensorFlow. I was checking out the MusicGenerator available at https://github.com/Conchylicultor/MusicGenerator and I am getting this error:
ValueError: Trying to share variable
rnn_decoder/KeyboardCell/Decoder/multi_rnn_cell/cell_0/basic_lstm_cell/kernel,
but specified shape (1024, 2048) and found shape (525, 2048).
I think this might be due to some variable shared between the encoder and decoder of the cell. The main code was written for TensorFlow 0.10.0, but I am trying to run it on TensorFlow 1.3.
def __call__(self, prev_keyboard, prev_state, scope=None):
    """ Run the cell at step t
    Args:
        prev_keyboard: keyboard configuration for the step t-1 (Ground truth or previous step)
        prev_state: a tuple (prev_state_enco, prev_state_deco)
        scope: TensorFlow scope
    Return:
        Tuple: the keyboard configuration and the enco and deco states
    """
    # First time only (we do the initialisation here to be on the global rnn loop scope)
    if not self.is_init:
        with tf.variable_scope('weights_keyboard_cell'):
            # TODO: With self.args, see which network we have chosen (create map 'network name':class)
            self.encoder.build()
            self.decoder.build()
            prev_state = self.encoder.init_state(), self.decoder.init_state()
            self.is_init = True

    # TODO: If encoder act as VAE, we should sample here, from the previous state
    # Encoder/decoder network
    with tf.variable_scope(scope or type(self).__name__):
        with tf.variable_scope('Encoder'):
            # TODO: Should be enco_output, enco_state
            next_state_enco = self.encoder.get_cell(prev_keyboard, prev_state)
        with tf.variable_scope('Decoder'):  # Reset gate and update gate.
            next_keyboard, next_state_deco = self.decoder.get_cell(prev_keyboard, (next_state_enco, prev_state[1]))
    return next_keyboard, (next_state_enco, next_state_deco)
I am completely new to RNNs and CNNs. I have been reading a bit about them and understand at a high level how they work, and how some parts of the code actually work during training and modelling. But I don't think I understand enough to debug this, especially because I am also a bit confused by the TensorFlow API.
It would be great to know why this might be happening and what I can do to fix it. Also, could you point me to some books on CNNs, RNNs, backpropagation, and how to use TensorFlow effectively to build things?
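One common trigger for exactly this message in TF 1.2 and later, sketched below with assumed shapes chosen to match the error; whether this is what happens inside KeyboardCell would need checking against the repository. An RNN cell instance owns its variables, so calling the same cell object on inputs of different depths asks TensorFlow to reuse a kernel under an incompatible shape:
import tensorflow as tf

num_units = 512
cell = tf.nn.rnn_cell.BasicLSTMCell(num_units)
x_small = tf.placeholder(tf.float32, [None, 10, 13])   # depth 13  -> kernel (13 + 512, 2048)
x_large = tf.placeholder(tf.float32, [None, 10, 512])  # depth 512 -> kernel (512 + 512, 2048)

with tf.variable_scope('Encoder'):
    tf.nn.dynamic_rnn(cell, x_small, dtype=tf.float32)  # builds kernel with shape (525, 2048)
with tf.variable_scope('Decoder'):
    # same cell object: ValueError, specified shape (1024, 2048) vs found shape (525, 2048)
    tf.nn.dynamic_rnn(cell, x_large, dtype=tf.float32)

# Fix sketch: give the encoder and decoder their own cell instances.
encoder_cell = tf.nn.rnn_cell.BasicLSTMCell(num_units)
decoder_cell = tf.nn.rnn_cell.BasicLSTMCell(num_units)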

How to calculate the mean training score using GridSearchCV in Scikit-Learn

I would like to plot the mean validation vs mean training score for Linear Support Vector machine in a similar fashion as done here: http://youtu.be/9qg9__n4X2A?t=20m33s
However, when running similar code, the parameter compute_training_score does not seem to exist.
This parameter is also not documented [1]. I checked the current master branch on GitHub and it does not seem to have been committed yet.
I am using Scikit-learn 0.14.1.
I am a bit confused here. Is there a branch or tag that I need in order to get the same functionality, or is there an alternative way to calculate this?
The code in question:
param_grid = {'C': 10. ** np.arange(-3, 4)}
grid_search = GridSearchCV(svm, param_grid=param_grid, cv=3, verbose=3,
                           compute_training_score=True)
grid_search.fit(X_train, y_train)

plt.plot([c.mean_validation_score for c in grid_search.cv_scores_], label="validation error")
plt.plot([c.mean_training_score for c in grid_search.cv_scores_], label="training error")
plt.xticks(np.arange(6), param_grid['C'])
plt.xlabel("C")
plt.ylabel("Accuracy")
plt.legend(loc='best')
If I run the same code without the offending parameter I get:
AttributeError: '_CVScoreTuple' object has no attribute 'mean_training_score'
[1] http://scikit-learn.org/stable/modules/generated/sklearn.grid_search.GridSearchCV.html
mean_validation_score and mean_training_score will be available in the next scikit-learn release, 0.15. You need to install from GitHub to get them now.
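For readers on a modern scikit-learn (0.18 or later), the equivalent information lives in cv_results_ when return_train_score=True is passed. A sketch assuming svm, X_train and y_train are defined as in the question:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import GridSearchCV

param_grid = {'C': 10. ** np.arange(-3, 4)}
grid_search = GridSearchCV(svm, param_grid=param_grid, cv=3, return_train_score=True)
grid_search.fit(X_train, y_train)

# mean_test_score / mean_train_score replace the old per-parameter tuples
plt.plot(grid_search.cv_results_['mean_test_score'], label="validation score")
plt.plot(grid_search.cv_results_['mean_train_score'], label="training score")
plt.xticks(np.arange(7), param_grid['C'])
plt.xlabel("C")
plt.ylabel("Accuracy")
plt.legend(loc='best')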
