Keras LSTM layers in Keras-rl - keras

I am trying to implement a DQN agent using Keras-rl. The problem is that when I define my model I need to use an LSTM layer in the architecture:
model = Sequential()
model.add(Flatten(input_shape=(1, 8000)))
model.add(Reshape(target_shape=(200, 40)))
model.add(LSTM(20))
model.add(Dense(3, activation='softmax'))
return model
Executing the rl-agent I obtain the following error:
RuntimeError: Attempting to capture an EagerTensor without building a function.
Which is related to the use of the LSTM and to the following line of code:
tf.compat.v1.disable_eager_execution()
Using a Dense layer instead of an LSTM:
model = Sequential()
model.add(Flatten(input_shape=(1, 8000)))
model.add(Dense(20))
model.add(Dense(3, activation='softmax'))
return model
and maintaining eager execution disabled I don't have the previously reported error. If I delete the disabling of the eager execution with the LSTM layer I have other errors.
Can anyone help me to understand the reason of the error?

The keras-rl library does not have explicit support for TensorFlow 2.0, so it will not work with such version of TensorFlow. The library is sparsely updated and the last release is around 2 years old (from 2018), so if you want to use it you should use TensorFlow 1.x

Install keras-rl2 from github support tensorflow 2.x

Although it is possible to migrate the code for keras-rl to use eager execution and therefore an LSTM. LSTMs need to be updated with a whole episode of learning to prove accurate something which keras-rl does not support. See more here: https://github.com/keras-rl/keras-rl/issues/41

Related

convert tensorflow.keras model to keras model

I have an EfficientNet model (tensorflow.keras==2.4) and would like to use innvestigate to inspect the results, but it requires keras==2.2.4
Training code:
tensorflow.keras.__version__ # 2.4
model = tf.keras.applications.EfficientNetB1(**params)
# do training
model.save('testModel')
I have the model saved as file but can not load it into Keras 2.2.4. This is the point where I'm stuck, I couldn't figure out what to do to convert the model.
Use Innvestigate:
keras.__version__ # 2.2.4
keras.model.load_model('testModel') # Error
# some more stuff...
I also found this thread, might try it, but since efficient net has > 350 layers it is not really applicable
How to load tf.keras models with keras
I don't know if it's actually possible to convert models between tensorflow.keras and keras, I appreciate all help I can get.
Due to version incompatibility between tensorflow as keras, you were not able to load model.
Your issue will be resolved, once you upgrade keras and tensorflow to 2.5.

Porting pre-trained keras models and run them on IPU

I am trying to port two pre-trained keras models into the IPU machine. I managed to load and run them using IPUstrategy.scope but I dont know if i am doing it the right way. I have my pre-trained models in .h5 file format.
I load them this way:
def first_model():
model = tf.keras.models.load_model("./model1.h5")
return model
After searching your ipu.keras.models.py file I couldn't find any load methods to load my pre-trained models, and this is why i used tf.keras.models.load_model().
Then i use this code to run:
cfg=ipu.utils.create_ipu_config()
cfg=ipu.utils.auto_select_ipus(cfg, 1)
ipu.utils.configure_ipu_system(cfg)
ipu.utils.move_variable_initialization_to_cpu()
strategy = ipu.ipu_strategy.IPUStrategy()
with strategy.scope():
model = first_model()
print('compile attempt\n')
model.compile("sgd", "categorical_crossentropy", metrics=["accuracy"])
print('compilation completed\n')
print('running attempt\n')
res = model.predict(input_img)[0]
print('run completed\n')
you can see the output here:link
So i have some difficulties to understand how and if the system is working properly.
Basically the model.compile wont compile my model but when i use model.predict then the system first compiles and then is running. Why is that happening? Is there another way to run pre-trained keras models on an IPU chip?
Another question I have is if its possible to load a pre-trained keras model inside an ipu.keras.model and then use model.fit/evaluate to further train and evaluate it and then save it for future use?
One last question I have is about the compilation part of the graph. Is there a way to avoid recompilation of the graph every time i use the model.predict() in a different strategy.scope()?
I use tensorflow2.1.2 wheel
Thank you for your time
To add some context, the Graphcore TensorFlow wheel includes a port of Keras for the IPU, available as tensorflow.python.ipu.keras. You can access the API documentation for IPU Keras at this link. This module contains IPU-specific optimised replacement for TensorFlow Keras classes Model and Sequential, plus more high-performance, multi-IPU classes e.g. PipelineModel and PipelineSequential.
As per your specific issue, you are right when you mention that there are no IPU-specific ways to load pre-trained Keras models at present. I would encourage you, as you appear to have access to IPUs, to reach out to Graphcore Support. When doing so, please attach your pre-trained Keras model model1.h5 and a self-contained reproducer of your code.
Switching topic to the recompilation question: using an executable cache prevents recompilation, you can set that up with environmental variable TF_POPLAR_FLAGS='--executable_cache_path=./cache'. I'd also recommend to take a look into the following resources:
this tutorial gathers several considerations around recompilation and how to avoid it when using TensorFlow2 on the IPU.
Graphcore TensorFlow documentation here explains how to use the pre-compile mode on the IPU.

using keras h5 weights in tf.keras model

I have h5 weights from a Keras model.
I want to rewrite the Keras model into a tf.keras model (using TF2.x).
I know that only the high level API changed, but do you know if I still can use the h5 weights?
Most likely they can be loaded, but is the structure different between Keras and tf.keras weights?
Thanks
It seems that they are the same
cudos to Mohsin hasan answer
In the past, when I had to convert tf.keras model to keras model, I
did following:
Train model in tf.keras
Save only the weights tf_model.save_weights("tf_model.hdf5")
Make Keras model architecture using all layers in keras (same as the tf keras one)
load weights by layer names in keras: keras_model.load_weights(by_name=True)
This seemed to work for me. Since, I was using out of box architecture
(DenseNet169), I had to very less work to replicate tf.keras network
to keras.
And the answer from Alex Cohn
tf.keras HDF5 model and Keras HDF5 models are not different things,
except for inevitable software version update synchronicity. This is
what the official docs say:
tf.keras is TensorFlow's implementation of the Keras API specification. This is a high-level API to build and train models that
includes first-class support for TensorFlow-specific functionality
If the convertor can convert a keras model to tf.lite, it will deliver
same results. But tf.lite functionality is more limited than tf.keras.
If this feature set is not enough for you, you can still work with
tensorflow, and enjoy its other advantages.

How to add CRF layer in a tensorflow sequential model?

I am trying to implement a CRF layer in a TensorFlow sequential model for a NER problem. I am not sure how to do it. Previously when I implemented CRF, I used CRF from keras with tensorflow as backend i.e. I created the entire model in keras instead of tensorflow and then passed the entire model through CRF. It worked.
But now I want to develop the model in Tensorflow as tensorflow2.0.0 beta already has keras inbuilt in it and I am trying to build a sequential layer and add CRF layer after a bidirectional lstm layer. Although I am not sure how to do that. I have gone through the CRF documentation in tensorflow-addons and it contains different functions such as forward CRF etc etc but not sure how to implement them as a layer ? I am wondering is it possible at all to implement a CRF layer inside a sequential tensorflow model or do I need to build the model graph from scratch and then use CRF functions ? Can anyone please help me with it. Thanks in advance
In the training process:
You can refer to this API:
tfa.text.crf_log_likelihood(
inputs,
tag_indices,
sequence_lengths,
transition_params=None
)
The inputs are the unary potentials(just like that in the logistic regression, and you can refer to this answer) and here in your case, they are the logits(it is usually not the distributions after the softmax activation function) or states of the BiLSTM for each character in the encoder(P1, P2, P3, P4 in the diagram above; ).
The tag_indices are the target tag indices, and the sequence_lengths represent the sequence lengths in a batch.
The transition_params are the binary potentials(also how the tag transits from one time step to the next), you can create the matrix yourself or you just let the API do it for you.
In the inference process:
You just utilize this API:
tfa.text.viterbi_decode(
score,
transition_params
)
The score stands for the same input like that in the training(the P1, P2, P3, P4 states) and the transition_params are also that trained in the training process.

loading weights keras LSTM not working

I am trying to load the weights from a Keras 1.0 Model into a Keras 2.0 model I created. I am sure the model architecture is exactly the same. The issues I am having is the load_weights() function is loading all the weights.
When I print the weights to a text file from the original model (loaded via load_model) and from the new model with load_weights() the later is missing many entry and are actually different. This also shows itself when making predictions as the accuracy is lower.
This problem only occurs in my LSTM layers. The embedding layers is fine and the Dense layer is also fine.
Any thoughts? I can not use load_model() as the original saved model was done in keras 1.0 and I need to use keras 2.0
EDIT MORE:
I should note I think the issue is the internal states not being loaded. Let me explain though. When I use get_weights() on each layer and I print it too terminal or a file the original model outputs a much larger matrix.
After using load_weights and then get_weights and print the weight matrix is missing many elements. I'm thinking it's the internal states.
The problem was that there was parameters for a compiled graph that were saved. I think it's safe to just port over the weights and continue training to let it catch up (maybe 1-2 epochs) if you can.
Gl

Resources