Getting errors while running ELMo embeddings in Google Colab - python-3.x

I am extracting features with ELMo. Train and Test are text data. I am getting errors while executing in Google Colab. I have checked previous Stack Overflow questions but could not resolve the issue. Exact code with pointers would be helpful.
import tensorflow as tf
import tensorflow_hub as hub

elmo = hub.Module("https://tfhub.dev/google/elmo/2", trainable=True)

def elmo_vectors(x):
    embeddings = elmo(x.tolist(), signature="default", as_dict=True)["elmo"]
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        sess.run(tf.tables_initializer())
        # return the average of the ELMo features
        return sess.run(tf.reduce_mean(embeddings, 1))

list_train = [train[i:i+100] for i in range(0, train.shape[0], 100)]
list_test = [test[i:i+100] for i in range(0, test.shape[0], 100)]

# Extract ELMo embeddings
elmo_train = [elmo_vectors(x['clean_tweet']) for x in list_train]
elmo_test = [elmo_vectors(x['clean_tweet']) for x in list_test]
I am getting the following error:
UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[node module_apply_default_1/bilm/CNN_2/Conv2D_6 (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_hub/native_module.py:517) ]]
[[node Mean (defined at :8) ]]

I tried right now on colab.research.google.com in Python 3 runtimes with and without GPU, and the following adaptation of your code runs:
import tensorflow as tf
import tensorflow_hub as hub

elmo = hub.Module("https://tfhub.dev/google/elmo/2", trainable=True)

def elmo_vectors(x):
    embeddings = elmo(x,  # Note plain x here.
                      signature="default", as_dict=True)["elmo"]
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        sess.run(tf.tables_initializer())
        # return the average of the ELMo features
        return sess.run(tf.reduce_mean(embeddings, 1))

elmo_vectors(["Hello world"])
I get the output:
array([[ 0.45319763, -0.99154925, -0.26539633, ..., -0.13455263,
         0.48878008,  0.31264588]], dtype=float32)
I believe this is not a TF Hub problem.

This happened to me as well. I think it is related to the amount of free RAM we get in the Google Colab free tier. I had to reduce the number of convolution layers and the batch size in order to run it. I was also not able to run it on the GPU.
So you could consider using Google Colab Pro.
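If memory is the bottleneck, one simple thing to try before upgrading is shrinking the batches fed to elmo_vectors. A minimal sketch, reusing the variable names from the question (the batch size of 25 is an arbitrary choice, not taken from the original answers):

batch_size = 25  # smaller than the original 100; tune to fit the available RAM/GPU memory

list_train = [train[i:i + batch_size] for i in range(0, train.shape[0], batch_size)]
list_test = [test[i:i + batch_size] for i in range(0, test.shape[0], batch_size)]

# Each call now processes a smaller chunk, lowering peak memory use per session run.
elmo_train = [elmo_vectors(x['clean_tweet']) for x in list_train]
elmo_test = [elmo_vectors(x['clean_tweet']) for x in list_test]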

Related

keras loss function (from keras input)

I am referencing this link: Keras custom loss function: Accessing current input pattern.
But I get the error: "TypeError: Cannot convert a symbolic Keras input/output to a numpy array. This error may indicate that you're trying to pass a symbolic value to a NumPy call, which is not supported. Or, you may be trying to pass Keras symbolic inputs/outputs to a TF API that does not register dispatching, preventing Keras from automatically converting the API call to a lambda layer in the Functional Model."
This is the source code. What happened?
# imports assumed; the original snippet did not show them
import numpy as np
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

def custom_loss_wrapper(input_tensor):
    def custom_loss(y_true, y_pred):
        return K.binary_crossentropy(y_true, y_pred) + K.mean(input_tensor)
    return custom_loss

input_tensor = Input(shape=(10,))
hidden = Dense(100, activation='relu')(input_tensor)
out = Dense(1, activation='sigmoid')(hidden)
model = Model(input_tensor, out)
model.compile(loss=custom_loss_wrapper(input_tensor), optimizer='adam')

X = np.random.rand(1000, 10)
y = np.random.rand(1000, 1)
model.train_on_batch(X, y)
In TF 2.0, eager mode is on by default. It's not possible to get this functionality in eager mode as the above example is currently written. I think there are ways to do it in eager mode with some more advanced programming, but otherwise it's a simple matter of turning eager mode off and running in graph mode with:
from tensorflow.python.framework.ops import disable_eager_execution
disable_eager_execution()
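Alternatively, if you want to keep eager execution on in TF 2.x, the same input-dependent penalty can usually be attached with model.add_loss instead of closing over the input in a custom loss function. This is only a sketch of that idea, not part of the original answer:

import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

input_tensor = Input(shape=(10,))
hidden = Dense(100, activation='relu')(input_tensor)
out = Dense(1, activation='sigmoid')(hidden)
model = Model(input_tensor, out)

# Attach the mean-of-input term symbolically; Keras tracks it as an extra model loss,
# so the compiled loss only needs the standard binary crossentropy part.
model.add_loss(tf.reduce_mean(input_tensor))
model.compile(loss='binary_crossentropy', optimizer='adam')

X = np.random.rand(1000, 10)
y = np.random.rand(1000, 1)
model.train_on_batch(X, y)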
When I received a similar error, I added
del model
before
model = Model(input_tensor, out)
It resolved my issue; you can give it a shot. I am eager to know if it solves your problem :)
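In context, that suggestion looks roughly like this (a sketch; the try/except guard is my addition so the snippet also runs when no model exists yet):

try:
    del model  # drop any previously built model, e.g. when re-running a notebook cell
except NameError:
    pass

model = Model(input_tensor, out)
model.compile(loss=custom_loss_wrapper(input_tensor), optimizer='adam')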

How to obtain the Jacobian Matrix with respect to the inputs of a keras model neural network?

I recently started learning and using automatic differentiation to determine the gradients and Jacobian matrix of a neural network with respect to a given input. The methods suggested by TensorFlow are tape.gradient and tape.jacobian. However, I am not able to obtain the Jacobian matrix using this method due to what appears to be a bug in TensorFlow. It works when I calculate tape.gradient(y_pred, x), but not for the Jacobian matrix, which should have a shape of (200, 3). I am open to other ways to calculate the Jacobian matrix, but I am more inclined to use automatic differentiation methods within TensorFlow. The current version I am using is TensorFlow 2.1.0. I'd greatly appreciate any advice!
import tensorflow as tf
import numpy as np

# The neural network accepts 3 inputs and produces 200 outputs. The actual values
# of the inputs and outputs are not written in the code as it is too involved.
num_inputs = 3
num_outputs = 200
num_hidden_layers = 5
num_neurons = 50
kernel = 'he_uniform'
activation = tf.keras.layers.LeakyReLU(alpha=0.3)

# Details of model (MLP)
current_model = tf.keras.models.Sequential()
current_model.add(tf.keras.Input(shape=(num_inputs,)))
for i in range(num_hidden_layers):
    current_model.add(tf.keras.layers.Dense(units=num_neurons, activation=activation, kernel_initializer=kernel))
current_model.add(tf.keras.layers.Dense(units=num_outputs, activation='linear', kernel_initializer=kernel))

# Finding the Jacobian matrix with respect to a given input of the neural network
# In this case, the inputs are [0.02, 0.4, 0.12] (i.e. 3 inputs)
x = tf.Variable([[0.02, 0.4, 0.12]], dtype=tf.float32)
with tf.GradientTape() as tape:
    y_pred = x
    for layer in current_model.layers:
        y_pred = layer(y_pred)
jacobian = tape.jacobian(y_pred, x)
print(jacobian)
Below is the error returned. I removed some parts for privacy purposes.
StagingError: in converted code:

    C:\Users\...\anaconda3\envs\tf\lib\site-packages\tensorflow_core\python\ops\parallel_for\control_flow_ops.py:183 f *
        return _pfor_impl(loop_fn, iters, parallel_iterations=parallel_iterations)
    C:\Users\...\anaconda3\envs\tf\lib\site-packages\tensorflow_core\python\ops\parallel_for\control_flow_ops.py:256 _pfor_impl
        outputs.append(converter.convert(loop_fn_output))
    C:\Users\...\anaconda3\envs\tf\lib\site-packages\tensorflow_core\python\ops\parallel_for\pfor.py:1280 convert
        output = self._convert_helper(y)
    C:\Users\...\anaconda3\envs\tf\lib\site-packages\tensorflow_core\python\ops\parallel_for\pfor.py:1453 _convert_helper
        if flags.FLAGS.op_conversion_fallback_to_while_loop:
    C:\Users\...\anaconda3\envs\tf\lib\site-packages\tensorflow_core\python\platform\flags.py:84 __getattr__
        wrapped(_sys.argv)
    C:\Users\...\anaconda3\envs\tf\lib\site-packages\absl\flags\_flagvalues.py:633 __call__
        name, value, suggestions=suggestions)

    UnrecognizedFlagError: Unknown command line flag 'f'
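Since the traceback runs through TensorFlow's parallel_for (pfor) machinery, one workaround worth trying - offered here only as a hedged suggestion, not a confirmed fix from this thread - is to disable the pfor path via the experimental_use_pfor argument of tape.jacobian:

x = tf.Variable([[0.02, 0.4, 0.12]], dtype=tf.float32)
with tf.GradientTape() as tape:
    y_pred = current_model(x)

# Fall back to a while_loop-based implementation instead of pfor;
# slower, but it avoids the pfor conversion that raises the error above.
jacobian = tape.jacobian(y_pred, x, experimental_use_pfor=False)
print(jacobian.shape)  # expected: (1, 200, 1, 3) for a single input row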

Question about Permutation Importance on LSTM Keras

# model-building imports assumed; the original snippet did not show them
from keras.models import Sequential
from keras.layers import LSTM, Dropout, Dense
from keras.wrappers.scikit_learn import KerasClassifier, KerasRegressor
import eli5
from eli5.sklearn import PermutationImportance

model = Sequential()
model.add(LSTM(units=30, return_sequences=True, input_shape=(X.shape[1], 421)))
model.add(Dropout(rate=0.2))
model.add(LSTM(units=30, return_sequences=True))
model.add(LSTM(units=30))
model.add(Dense(units=1, activation='relu'))

perm = PermutationImportance(model, scoring='accuracy', random_state=1).fit(X, y, epochs=500, batch_size=8)
eli5.show_weights(perm, feature_names=X.columns.tolist())
I am running an LSTM just to see the feature importance of my dataset, which contains 400+ features. I used the Keras scikit-learn wrapper so that I could use eli5's PermutationImportance function. But the code is returning
ValueError: Found array with dim 3. Estimator expected <= 2.
The code runs smoothly if I use model.fit(), but I can't debug the permutation importance error. Does anyone know what is wrong?
eli5's scikit-learn implementation for determining permutation importance can only process 2D arrays, while Keras' LSTM layers require 3D arrays. This is a known issue, but there appears to be no solution yet.
I understand this does not really answer your question of getting eli5 to work with an LSTM (because it currently can't), but I encountered the same problem and used another library called SHAP to get the feature importance of my LSTM model. Here is some of my code to help you get started:
import shap

DE = shap.DeepExplainer(model, X_train)  # X_train is a 3d numpy.ndarray
shap_values = DE.shap_values(X_validate_np, check_additivity=False)  # X_validate_np is a 3d numpy.ndarray

shap.initjs()
shap.summary_plot(
    shap_values[0],
    X_validate,
    feature_names=list_of_your_columns_here,
    max_display=50,
    plot_type='bar')
Here is an example of the kind of graph you can get. Hope this helps.

Accessing a pretrained model's weights (names and values) in TensorFlow

Quantitative analysis of neural networks requires loading all variables first in TensorFlow. So I import the SSDLite MobileNet v2 model's checkpoint and restore the weights. But I cannot get its weights or variables using functions such as get_all_collection_keys or get_collection. The code snippet is as follows:
import tensorflow as tf
import numpy as np
model_dir = 'ssdlite_mobilenet_v2_coco_2018_05_09'
checkpoint = tf.train.get_checkpoint_state(model_dir)
input_checkpoint = checkpoint.model_checkpoint_path # ssdlite_mobilenet_v2_coco_2018_05_09/model.ckpt
sess = tf.Session()
tf.reset_default_graph()
saver = tf.train.import_meta_graph(input_checkpoint + '.meta')
graph = tf.get_default_graph()
saver.restore(sess, input_checkpoint)
# INFO:tensorflow:Restoring parameters from ssdlite_mobilenet_v2_coco_2018_05_09/model.ckpt
graph_collections_keys = graph.get_all_collection_keys()
print(graph_collections_keys) # []
hyperparameters = tf.get_collection('hyperparameters')
print(len(hyperparameters)) # 0
model_variables = tf.get_collection(tf.GraphKeys.MODEL_VARIABLES)
print(len(model_variables)) # 0
So, how to access a pretrained model's weights (names and values) in TensorFlow?
Pal, you can't restore variable names; weights are as far as you go. Also, in TensorFlow (or in your code in general), the only option you have for restoring is the saver. It also depends on what kind of data you want to restore: if it's a .npy file, then np.load will do.
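If the goal is just to list variable names and read their values from the checkpoint file itself, the checkpoint utilities in tf.train can do that without rebuilding the graph. A minimal sketch, assuming a TF 1.x-era environment matching the question:

import tensorflow as tf

model_dir = 'ssdlite_mobilenet_v2_coco_2018_05_09'
ckpt_path = tf.train.get_checkpoint_state(model_dir).model_checkpoint_path

# Print every (name, shape) pair stored in the checkpoint.
for name, shape in tf.train.list_variables(ckpt_path):
    print(name, shape)

# Load the value of a single variable by name (name taken from the listing above).
# value = tf.train.load_variable(ckpt_path, 'some/variable/name')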

retraining last layer of inception-v3 significantly slows down the classification

In an attempt at transfer learning over Inception-v3 with TF and Python 3.5, I've tested two approaches:
1- Retraining the last layer, as shown here: https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/image_retraining
2- Applying a linear SVM on top of Inception-v3 bottlenecks, as demonstrated here: https://www.kernix.com/blog/image-classification-with-a-pre-trained-deep-neural-network_p11
Expectedly, they should have had a similar runtime for the classification phase, since the critical part - the bottleneck extraction - is identical. In practice, though, the retrained network is about 8x slower when running classification.
My question is whether anyone has an idea of the reason for this.
Some code snippets:
SVM on top (the faster one):
def getTensors():
    graph_def = tf.GraphDef()
    f = open('classify_image_graph_def.pb', 'rb')
    graph_def.ParseFromString(f.read())
    tensorBottleneck, tensorsResizedImage = tf.import_graph_def(graph_def, name='', return_elements=['pool_3/_reshape:0', 'Mul:0'])
    return tensorBottleneck, tensorsResizedImage

def calc_bottlenecks(imgFile, tensorBottleneck, tensorsResizedImage):
    """ - read, decode and resize to get <resizedImage> - """
    bottleneckValues = sess.run(tensorBottleneck, {tensorsResizedImage: resizedImage})
    return np.squeeze(bottleneckValues)
This takes about 0.5 seconds on my (Windows) laptop, while the SVM part takes no time.
Retraining the last layer (harder to summarize since the code is longer):
def loadGraph(pbFile):
    with tf.gfile.FastGFile(pbFile, 'rb') as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
        tf.import_graph_def(graph_def, name='')
    with tf.Session() as sess:
        softmaxTensor = sess.graph.get_tensor_by_name('final_result:0')

def labelImage(imageFile, softmaxTensor):
    with tf.Session() as sess:
        input_layer_name = 'DecodeJpeg/contents:0'
        predictions, = sess.run(softmaxTensor, {input_layer_name: image_data})
'pbFile' is the file saved by the retrainer, which is supposed to have identical topology and weights to 'classify_image_graph_def.pb', excluding the classification layer. This takes about 4 seconds to run (on the same laptop, not counting the loading).
Any idea what causes the performance gap?
Thanks!
Solved. The problem was creating a new tf.Session() for every image. Storing the session when reading the graph and reusing it brought the runtime back to what was expected.
def loadGraph(pbFile):
    ...
    # Create the session once and keep it open (a with-block would close it on exit).
    sessToStore = tf.Session()
    softmaxTensor = sessToStore.graph.get_tensor_by_name('final_result:0')
    return softmaxTensor, sessToStore

def labelImage(imageFile, softmaxTensor, sessToStore):
    input_layer_name = 'DecodeJpeg/contents:0'
    predictions, = sessToStore.run(softmaxTensor, {input_layer_name: image_data})
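For completeness, the intended usage pattern is to load the graph and create the session once, then reuse both for every image. A sketch of that pattern (the file name and image list are placeholders, not from the original post):

softmaxTensor, sessToStore = loadGraph('retrained_graph.pb')  # placeholder file name

for imageFile in imageFiles:  # imageFiles is a placeholder list of image paths
    labelImage(imageFile, softmaxTensor, sessToStore)

sessToStore.close()  # release the session when done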
