I'm trying to get the same functionality as the PyTorch LSTM using the PyTorch LSTMCell.
I've read this: Pytorch LSTM vs LSTMCell
but I'm not sure how it can be done,
i.e. I need to use the same API as LSTM, so maybe I need to create a method or class for it?
Can you assist?
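Not from the original post, but a minimal sketch of one way to do it, assuming a single-layer, unidirectional LSTM with batch_first=False; the class name LSTMWithCell is made up for illustration:

import torch
import torch.nn as nn

class LSTMWithCell(nn.Module):
    # Unrolls an nn.LSTMCell over time to mimic a single-layer,
    # unidirectional nn.LSTM with batch_first=False.
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.hidden_size = hidden_size
        self.cell = nn.LSTMCell(input_size, hidden_size)

    def forward(self, x, state=None):
        # x: (seq_len, batch, input_size), as with nn.LSTM
        seq_len, batch, _ = x.shape
        if state is None:
            h = x.new_zeros(batch, self.hidden_size)
            c = x.new_zeros(batch, self.hidden_size)
        else:
            # nn.LSTM carries a leading num_layers dimension; drop it
            h, c = state[0].squeeze(0), state[1].squeeze(0)
        outputs = []
        for t in range(seq_len):
            h, c = self.cell(x[t], (h, c))
            outputs.append(h)
        out = torch.stack(outputs)  # (seq_len, batch, hidden_size)
        return out, (h.unsqueeze(0), c.unsqueeze(0))

rnn = LSTMWithCell(10, 20)
x = torch.randn(5, 3, 10)
out, (h_n, c_n) = rnn(x)  # same shapes as nn.LSTM(10, 20)(x)

Copying weights from an nn.LSTM (its weight_ih_l0/weight_hh_l0 map onto the cell's weight_ih/weight_hh) would let you verify the two produce matching outputs.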
I have h5 weights from a Keras model.
I want to rewrite the Keras model as a tf.keras model (using TF 2.x).
I know that only the high-level API changed, but do you know if I can still use the h5 weights?
Most likely they can be loaded, but is the structure different between Keras and tf.keras weights?
Thanks
It seems that they are the same.
Kudos to Mohsin hasan's answer:
In the past, when I had to convert a tf.keras model to a keras model, I
did the following:
Train the model in tf.keras
Save only the weights: tf_model.save_weights("tf_model.hdf5")
Build the Keras model architecture using all layers in keras (same as the tf.keras one)
Load the weights by layer name in keras: keras_model.load_weights("tf_model.hdf5", by_name=True)
This seemed to work for me. Since I was using an out-of-the-box architecture
(DenseNet169), there was very little work needed to replicate the tf.keras network
in keras.
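As a consolidated sketch, the steps above amount to something like this (assuming both standalone keras and tf.keras are installed and the two architectures produce identical layer names):

import tensorflow as tf
import keras

# 1. Train in tf.keras (DenseNet169, as in the answer), then save weights only.
tf_model = tf.keras.applications.DenseNet169(weights=None)
tf_model.save_weights("tf_model.hdf5")

# 2. Rebuild the same architecture in standalone keras.
keras_model = keras.applications.DenseNet169(weights=None)

# 3. Load the tf.keras weights into the keras model by layer name.
keras_model.load_weights("tf_model.hdf5", by_name=True)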
And the answer from Alex Cohn:
tf.keras HDF5 models and Keras HDF5 models are not different things,
except for inevitable software version update synchronicity. This is
what the official docs say:
tf.keras is TensorFlow's implementation of the Keras API specification. This is a high-level API to build and train models that
includes first-class support for TensorFlow-specific functionality
If the converter can convert a keras model to tf.lite, it will deliver
the same results. But tf.lite functionality is more limited than tf.keras.
If this feature set is not enough for you, you can still work with
tensorflow, and enjoy its other advantages.
I want to extract features using a pretrained CNN model (ResNet50, VGG, etc.) and use the features with a CTC loss function.
I want to build it as a text recognition model.
Does anyone know how I can achieve this?
I'm not sure if you are looking to finetune the pretrained models or to use them for fixed feature extraction. To do the latter, freeze the pretrained model's weights by setting requires_grad = False on its parameters (note that calling .eval() on the model only switches layers such as dropout and batch norm to inference mode; it does not stop gradient updates), and feed the features from the last layer of the model into your new output head. See the PyTorch tutorial here for a more in-depth guide.
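Not from the answer above, but a minimal sketch of the idea under stated assumptions: a frozen ResNet50 backbone turned into a horizontal feature sequence and fed to a linear head with nn.CTCLoss; the alphabet size, image size, and dummy targets are all illustrative assumptions:

import torch
import torch.nn as nn
import torchvision.models as models

# Frozen pretrained backbone (ResNet50 without its avgpool/fc head).
backbone = models.resnet50(pretrained=True)
backbone = nn.Sequential(*list(backbone.children())[:-2])
for p in backbone.parameters():
    p.requires_grad = False  # freeze: backbone gets no gradient updates

num_classes = 37  # assumption: 26 letters + 10 digits + CTC blank at index 0
head = nn.Linear(2048, num_classes)
ctc_loss = nn.CTCLoss(blank=0)

x = torch.randn(4, 3, 32, 512)              # (batch, C, H, W) text-line crops
feats = backbone(x)                          # (4, 2048, 1, 16) feature map
feats = feats.mean(dim=2).permute(2, 0, 1)   # (T=16, batch, 2048) sequence
log_probs = head(feats).log_softmax(-1)      # (T, batch, num_classes)

targets = torch.randint(1, num_classes, (4, 5))         # dummy label strings
input_lengths = torch.full((4,), 16, dtype=torch.long)
target_lengths = torch.full((4,), 5, dtype=torch.long)
loss = ctc_loss(log_probs, targets, input_lengths, target_lengths)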
I am interested in training a model in tf.keras and then loading it with keras. I know this is not highly advised, but I am interested in using tf.keras to train the model because
tf.keras makes it easier to build input pipelines
I want to take advantage of the tf.data API
and I am interested in loading it with keras because
I want to use coreml to deploy the model to iOS.
I want to use coremltools to convert my model for iOS, and coremltools only works with keras, not tf.keras.
I have run into a few roadblocks, because not all of the tf.keras layers can be loaded as keras layers. For instance, I've had no trouble with a simple DNN, since all of the Dense layer parameters are the same between tf.keras and keras. However, I have had trouble with RNN layers, because tf.keras has an argument time_major that keras does not have. My RNN layers have time_major=False, which is the same behavior as keras, but keras sequential layers do not accept this argument.
My solution right now is to save the tf.keras model to a json file (for the model structure), delete the parts of the layers that keras does not support, and also save an h5 file (for the weights), like so:
import json

model = ...  # model trained with tf.keras

# save json
model_json = model.to_json()
with open('path_to_model_json.json', 'w') as json_file:
    json_ = json.loads(model_json)
    layers = json_['config']['layers']
    for layer in layers:
        if layer['class_name'] == 'SimpleRNN':
            del layer['config']['time_major']  # keras does not know this key
    json.dump(json_, json_file)

# save weights
model.save_weights('path_to_my_weights.h5')
Then, I use the coremltools converter to convert from keras to coreml, like so:
from keras.initializers import glorot_uniform
from keras.utils import CustomObjectScope
import coremltools
import tensorflow as tf

with CustomObjectScope({'GlorotUniform': glorot_uniform()}):
    coreml_model = coremltools.converters.keras.convert(
        model=('path_to_model_json.json', 'path_to_my_weights.h5'),
        input_names=...,   # inputs
        output_names=...,  # outputs
        class_labels=...,  # labels
        custom_conversion_functions={
            'GlorotUniform': tf.keras.initializers.glorot_uniform
        },
    )
coreml_model.save('my_core_ml_model.mlmodel')
My solution appears to be working, but I am wondering if there is a better approach? Or, is there imminent danger in this approach? For instance, is there a better way to convert tf.keras models to coreml? Or is there a better way to convert tf.keras models to keras? Or is there a better approach that I haven't thought of?
Any advice on the matter would be greatly appreciated :)
Your approach seems good to me!
In the past, when I had to convert a tf.keras model to a keras model, I did the following:
Train the model in tf.keras
Save only the weights: tf_model.save_weights("tf_model.hdf5")
Build the Keras model architecture using all layers in keras (same as the tf.keras one)
Load the weights by layer name in keras: keras_model.load_weights("tf_model.hdf5", by_name=True)
This seemed to work for me. Since I was using an out-of-the-box architecture (DenseNet169), there was very little work needed to replicate the tf.keras network in keras.
I am setting up fit_generator to train a DNN with Keras, but I don't know how to use a CNN inside this generator.
Basically, I have a pre-trained image generator built from fully convolutional networks (let's call it GEN-NET). Now I want to use this fully convolutional network in my fit_generator to generate an unlimited number of images to train another classifier (called CLASS-NET) in Keras. But it always crashes my training, and the error message is:
ValueError: Tensor Tensor("decoder/transform_output/mul:0", shape=(?, 128, 128, 1), dtype=float32) is not an element of this graph.
This "decoder/transform_output/mul:0" is the output of my CNN GEN-NET.
So my question is: can I use the CNN-based GEN-NET in my fit_generator to train CLASS-NET, or is this not permitted in Keras?
Keras does not really like running two separate models in a single session. You could use K.clear_session() after using the model, but this would produce a lot of overhead!
The best way to do this, IMHO, is to pre-generate the images and then load them using a generator, basically splitting your program into two separate programs.
Otherwise, if you are using tensorflow as the back end, there might be a way to do it by switching the default graph on the tf.Session; you could Google that, but I would not recommend it! :)
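A rough sketch of the two-program split described above; gen_net, class_net, the latent size, and the file names are illustrative assumptions, not the asker's code:

import numpy as np

# --- program 1: run GEN-NET once and dump its outputs to disk ---
# (hypothetical names; load your trained generator however you saved it)
# gen_net = keras.models.load_model('gen_net.h5')
# noise = np.random.normal(size=(10000, 100))     # assumed latent size 100
# np.save('generated_images.npy', gen_net.predict(noise, batch_size=64))

# --- program 2: train CLASS-NET from the saved arrays via a generator ---
def batch_generator(images_path='generated_images.npy',
                    labels_path='labels.npy', batch_size=32):
    images = np.load(images_path)
    labels = np.load(labels_path)
    while True:  # Keras generators are expected to loop forever
        idx = np.random.randint(0, len(images), size=batch_size)
        yield images[idx], labels[idx]

# class_net.fit_generator(batch_generator(), steps_per_epoch=100, epochs=10)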
Seems like you might have things a bit mixed up! The CNN (convolutional neural network) needs to be trained to your data, unless you're using a pretrained network for predictions. If you're going to train the CNN, you can do that with either the fit() or the fit_generator() function. Use fit() if you're feeding data directly, and use fit_generator() if your data is handled by Image Data Generators. If you've loaded a pre-trained model/weights only to make predictions, you don't need to use any fit function, since no training needs to be done.
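For illustration, a minimal self-contained sketch of the two training paths mentioned above (Keras 2.x API, where fit_generator() is still a separate method; the toy model and the 'data/train' directory of class subfolders are assumptions):

import numpy as np
from keras.models import Sequential
from keras.layers import Conv2D, Flatten, Dense
from keras.preprocessing.image import ImageDataGenerator

model = Sequential([
    Conv2D(8, 3, activation='relu', input_shape=(32, 32, 3)),
    Flatten(),
    Dense(10, activation='softmax'),
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')

# Path 1: data fed directly as arrays -> fit()
x_train = np.random.rand(100, 32, 32, 3)
y_train = np.random.randint(0, 10, size=(100,))
model.fit(x_train, y_train, batch_size=32, epochs=1)

# Path 2: data handled by an ImageDataGenerator -> fit_generator()
datagen = ImageDataGenerator(rescale=1./255)
flow = datagen.flow_from_directory('data/train', target_size=(32, 32),
                                   batch_size=32, class_mode='sparse')
model.fit_generator(flow, steps_per_epoch=10, epochs=1)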
I would like to visualize the gradients of a seq2seq model using the Keras TensorBoard callback. If I use a regular LSTM cell in my encoder and decoder, I get nice non-zero gradient histograms.
However, if I change the RNN cell to CuDNNLSTM, some gradients turn to zero, which seems incorrect.
Both models seem to train correctly.
So, what's wrong with the visualisations of the CuDNNLSTM gradients? Is there a bug in the Keras TensorBoard callback?
The code I am running is a slightly modified Keras lstm_seq2seq example: https://gist.github.com/nicolas-ivanov/1818d6502d5f1496e5fbe14889eddca1
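For reference, a minimal sketch of how gradient histograms are typically enabled with the Keras 2.x TensorBoard callback (write_grads requires histogram_freq > 0 and validation data; the toy model and arrays stand in for the seq2seq example):

import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dense
from keras.callbacks import TensorBoard

# toy model standing in for the seq2seq encoder/decoder
model = Sequential([LSTM(16, input_shape=(10, 8)), Dense(1)])
model.compile(optimizer='adam', loss='mse')

x = np.random.rand(64, 10, 8)
y = np.random.rand(64, 1)

# write_grads only logs gradients when histogram_freq > 0 and
# validation data is available
tb = TensorBoard(log_dir='./logs', histogram_freq=1, write_grads=True)
model.fit(x, y, validation_split=0.2, epochs=2, callbacks=[tb])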