Cannot get the same output as the PyTorch model with OpenVINO

I have a strange problem when trying to use OpenVINO.
I exported my PyTorch model to ONNX and then converted it for OpenVINO with the Model Optimizer, using the following command:
python /opt/intel/openvino/deployment_tools/model_optimizer/mo.py --input_model ~/Downloads/unet2d.onnx --disable_resnet_optimization --disable_fusing --disable_gfusing --data_type=FP32
So for this test case, I have disabled the optimizations.
Now, based on the sample Python applications, I run inference with the model as follows:
from os import path
from openvino.inference_engine import IENetwork, IECore
import numpy as np
model_xml = path.expanduser('model.xml')
model_bin = path.expanduser('model.bin')
ie = IECore()
net = IENetwork(model=model_xml, weights=model_bin)
input_blob = next(iter(net.inputs))
out_blob = next(iter(net.outputs))
net.batch_size = 1
exec_net = ie.load_network(network=net, device_name='CPU')
np.random.seed(0)
x = np.random.randn(1, 2, 256, 256) # expected input shape
res = exec_net.infer(inputs={input_blob: x})
res = res[out_blob]
The problem is that this outputs something completely different from my ONNX or PyTorch model.
Additionally, I realized that I do not even have to pass an input, so if I do something like:
x = None
res = exec_net.infer(inputs={input_blob: x})
This still returns the same output! That seems to suggest that my input is somehow being ignored.

Could you try again without --disable_resnet_optimization --disable_fusing --disable_gfusing, i.e. leaving the optimizations enabled?
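When checking whether a converted model matches the original, it can help to compare the raw output arrays numerically rather than by eye. The following is a minimal NumPy sketch of such a check. The helper name compare_outputs and the tolerances are illustrative assumptions, and the arrays are stand-ins for the PyTorch/ONNX output and the OpenVINO output computed from the same input. The input is cast to float32 here because np.random.randn returns float64, while the model above was converted with --data_type=FP32.

```python
import numpy as np

def compare_outputs(reference, candidate, rtol=1e-3, atol=1e-5):
    """Return (matches, max_abs_diff) for two arrays of the same shape.

    Hypothetical helper for illustration; the tolerances are assumptions.
    """
    reference = np.asarray(reference, dtype=np.float32)
    candidate = np.asarray(candidate, dtype=np.float32)
    if reference.shape != candidate.shape:
        raise ValueError(f"shape mismatch: {reference.shape} vs {candidate.shape}")
    max_abs_diff = float(np.max(np.abs(reference - candidate)))
    matches = bool(np.allclose(reference, candidate, rtol=rtol, atol=atol))
    return matches, max_abs_diff

# Stand-in arrays: in practice these would be the outputs of the two
# frameworks for the same float32 input.
np.random.seed(0)
x = np.random.randn(1, 2, 256, 256).astype(np.float32)
same, diff = compare_outputs(x, x + 1e-7)
print(same, diff)
```

A large max_abs_diff that does not change when the input changes would be consistent with the input being ignored, as described in the question.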

Related

Extract the features of last layer from the pytorch-fasterrcnn-resnet50-fpn

I have an image and need to use pytorch-fasterrcnn-resnet50-fpn to extract its features. Below is the code that I am trying:
import torch
import torchvision
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
### strip the last layer
feature_extractor = torch.nn.Sequential(*list(model_ft.children())[:-1])
inputs = feature_extractor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
last_hidden_states = outputs.last_hidden_state
Here, the type of image is PIL.JpegImagePlugin.JpegImageFile.
The above code is not working for getting the features. Can anyone tell me how to solve this?

Saving a model that uses tensorflow.lookup.StaticVocabularyTable in .pb format in Tensorflow 2

I am building a model that accepts a 2D array of string tokens as input, then uses a lookup table to get the indices assigned to the input tokens in the vocabulary. The model then uses those indices to compute an embedded representation of the input tokens by fetching the associated token embeddings and adding them together. The combined embedding is then compared against another matrix using a nearest-neighbors lookup, and the indices of the top-k most similar entries are returned.
The model is saved in .pb format and is then used in a container running the TensorFlow Serving image for inference.
At the moment I have something that works just fine in TensorFlow 1.15; however, I am trying to migrate my code to TensorFlow 2.4 and can't find a way to make it work.
Here is a slightly modified version of the code I am currently working with in TensorFlow 1.15:
import numpy as np
import tensorflow as tf
graph = tf.get_default_graph()
session = tf.Session()
tf.global_variables_initializer()
vocabulary = ['one', 'two', 'three', 'four', 'five', 'six']
embedding_dimension = 512
n_tokens = len(vocabulary)
token_embeddings = np.random.random((n_tokens, embedding_dimension))
matrix = np.random.random((100, embedding_dimension))
lookup_table_initializer = tf.lookup.KeyValueTensorInitializer(vocabulary, np.arange(n_tokens))
lookup_table = tf.lookup.StaticVocabularyTable(lookup_table_initializer, num_oov_buckets=1)
token_embeddings_with_oov_token = np.vstack([token_embeddings, np.zeros(embedding_dimension)])
token_embeddings_tensor = tf.convert_to_tensor(token_embeddings_with_oov_token, dtype=tf.float32)
matrix = tf.convert_to_tensor(matrix, dtype=tf.float32)
model_input = tf.placeholder(tf.string, [None, None], name="input")
input_tokens_indices = lookup_table.lookup(model_input)
input_token_indices_one_hot = tf.one_hot(input_tokens_indices, tf.dtypes.cast(lookup_table.size(), dtype=np.int32))
encoded_text = tf.math.reduce_sum(input_token_indices_one_hot, axis=1, keepdims=True)
embedded_text = tf.linalg.matmul(encoded_text, token_embeddings_tensor)
embedded_text_pooled = tf.math.reduce_sum(embedded_text, axis=1)
embedded_text_normed = tf.divide(embedded_text_pooled, tf.norm(embedded_text_pooled, ord=2))
neighbors = tf.linalg.matmul(embedded_text_normed, matrix, transpose_b=True, name="output")
tf.saved_model.simple_save(
    session,
    "model.pb",
    inputs={"input": model_input},
    outputs={"output": neighbors},
    legacy_init_op=tf.tables_initializer(),
)
The issue I am facing arises when converting the above code to TensorFlow 2. First of all, tf.placeholder no longer exists. Other posts suggest replacing it with tf.keras.layers.Input((), dtype=tf.dtypes.string), but then I get an error at the lookup_table.lookup() step, as apparently I cannot pass a symbolic tensor to that function. As a result I am stuck and do not know how to proceed to make my model compatible with TF2; after hours of searching online, I can't find anything that works.
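For reference, the computation described at the top of this question (token lookup with one out-of-vocabulary bucket, summed embeddings, normalization, similarity against a matrix, top-k indices) can be written out in plain NumPy to pin down the expected shapes independently of any TensorFlow version. This is only a sketch under assumptions: the function name top_k_neighbors, the small embedding dimension, and the value of k are illustrative, and each row is normalized separately here, whereas the TensorFlow 1.15 code divides by a single norm over the whole batch.

```python
import numpy as np

vocabulary = ["one", "two", "three", "four", "five", "six"]
embedding_dimension = 8  # small value for illustration; the question uses 512
n_tokens = len(vocabulary)

rng = np.random.default_rng(0)
token_embeddings = rng.random((n_tokens, embedding_dimension))
matrix = rng.random((100, embedding_dimension))

# Index n_tokens plays the role of the single OOV bucket; its embedding is
# all zeros, so unknown tokens contribute nothing to the sum.
token_to_index = {token: i for i, token in enumerate(vocabulary)}
embeddings_with_oov = np.vstack([token_embeddings, np.zeros(embedding_dimension)])

def top_k_neighbors(batch, k=5):
    """batch: rectangular list of lists of string tokens (hypothetical helper)."""
    indices = np.array([[token_to_index.get(t, n_tokens) for t in row] for row in batch])
    summed = embeddings_with_oov[indices].sum(axis=1)        # (batch, dim)
    norms = np.linalg.norm(summed, axis=1, keepdims=True)    # (batch, 1)
    normed = summed / np.maximum(norms, 1e-12)               # avoid division by zero
    scores = normed @ matrix.T                               # (batch, 100)
    return np.argsort(-scores, axis=1)[:, :k]                # most similar first

result = top_k_neighbors([["one", "two"], ["three", "unknown"]], k=3)
print(result.shape)
```

A TensorFlow 2 version would need to reproduce the same steps inside a traced function so that the lookup receives concrete string tensors rather than Keras symbolic inputs.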

Cannot export PyTorch model to ONNX

I am trying to convert a pre-trained torch model to ONNX, but receive the following error:
RuntimeError: step!=1 is currently not supported
I'm trying this on a pre-trained colorization model: https://github.com/richzhang/colorization
Here is the code I ran in Google Colab:
!git clone https://github.com/richzhang/colorization.git
%cd colorization/
import torch
import colorizers
model = colorizer_siggraph17 = colorizers.siggraph17(pretrained=True).eval()
input_names = ["input"]
output_names = ["output"]
dummy_input = torch.randn(1, 1, 256, 256, device='cpu')
torch.onnx.export(model, dummy_input, "test_converted_model.onnx", verbose=True,
                  input_names=input_names, output_names=output_names)
I appreciate any help :)
UPDATE 1: @Proko's suggestion solved the ONNX export issue. Now I have a new, possibly related problem when I try to convert the ONNX model to TensorRT. I get the following error:
[TensorRT] ERROR: Network must have at least one output
Here is the code I used:
import torch
import pycuda.driver as cuda
import pycuda.autoinit
import tensorrt as trt
import onnx
TRT_LOGGER = trt.Logger()
def build_engine(onnx_file_path):
    # initialize TensorRT engine and parse ONNX model
    builder = trt.Builder(TRT_LOGGER)
    builder.max_workspace_size = 1 << 25
    builder.max_batch_size = 1
    if builder.platform_has_fast_fp16:
        builder.fp16_mode = True
    network = builder.create_network()
    parser = trt.OnnxParser(network, TRT_LOGGER)
    # parse ONNX
    with open(onnx_file_path, 'rb') as model:
        print('Beginning ONNX file parsing')
        parser.parse(model.read())
    print('Completed parsing of ONNX file')
    # generate TensorRT engine optimized for the target platform
    print('Building an engine...')
    engine = builder.build_cuda_engine(network)
    context = engine.create_execution_context()
    print("Completed creating Engine")
    return engine, context
ONNX_FILE_PATH = 'siggraph17.onnx' # Exported using the code above
engine,_ = build_engine(ONNX_FILE_PATH)
I tried to force the build_engine function to use the output of the network by:
network.mark_output(network.get_layer(network.num_layers-1).get_output(0))
but it did not work.
I appreciate any help!
As I mentioned in a comment, this happens because slicing in torch.onnx supports only step = 1, but the model contains slicing with a step of 2:
self.model2(conv1_2[:,:,::2,::2])
Your only option for now is to rewrite the slicing using other ops. You can do this by using arange and reshape to obtain the proper indices. Consider the following function, "step-less arange" (I hope it is generic enough for anyone with a similar problem):
def sla(x, step):
    diff = x % step
    x += (diff > 0)*(step - diff) # add length to be able to reshape properly
    return torch.arange(x).reshape((-1, step))[:, 0]
usage:
>> sla(11, 3)
tensor([0, 3, 6, 9])
Now you can replace every slice like this:
conv2_2 = self.model2(conv1_2[:,:,self.sla(conv1_2.shape[2], 2),:][:,:,:, self.sla(conv1_2.shape[3], 2)])
NOTE: you should optimize this. The indices are recalculated on every call, so it might be wise to pre-compute them.
I have tested it with my fork of the repo and I was able to save the model:
https://github.com/prokotg/colorization
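As a quick sanity check of the index trick above, the same helper can be written with NumPy and compared against ordinary strided slicing. This is a sketch, not the code from the fork: the NumPy port and the toy array feature_map are assumptions made for illustration. It also pre-computes the indices once, in line with the note above.

```python
import numpy as np

def sla(length, step):
    """Indices 0, step, 2*step, ... below `length`, built without a strided slice."""
    remainder = length % step
    # Pad the length up to a multiple of `step` so the reshape is valid.
    padded = length + (step - remainder if remainder else 0)
    return np.arange(padded).reshape(-1, step)[:, 0]

# Toy 4-D array standing in for an activation such as conv1_2.
feature_map = np.arange(1 * 1 * 7 * 5).reshape(1, 1, 7, 5)

# Pre-compute the indices once, then gather along height and width.
rows = sla(feature_map.shape[2], 2)
cols = sla(feature_map.shape[3], 2)
gathered = feature_map[:, :, rows, :][:, :, :, cols]

print(np.array_equal(gathered, feature_map[:, :, ::2, ::2]))
```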
What worked for me was adding opset_version=11 to the torch.onnx.export call.
I first tried opset_version=10, but the API suggested 11, and that worked.
So your call should be:
torch.onnx.export(model, dummy_input, "test_converted_model.onnx", verbose=True, opset_version=11,
                  input_names=input_names, output_names=output_names)

Model parallelism in Keras

I am trying to implement model parallelism in Keras.
I am using Keras-2.2.4 and TensorFlow-1.13.1.
The rough structure of my code is:
import tensorflow as tf
import keras
from keras.layers import Input, concatenate
from keras.models import Model
def model_definition():
    input0 = Input(shape=(None, None))
    input1 = Input(shape=(None, None))
    with tf.Session(config=tf.ConfigProto(allow_soft_placement=False, log_device_placement=True)):
        model = get_some_CNN_model()
        with tf.device(tf.DeviceSpec(device_type="GPU", device_index=0)):
            op0 = model(input0)
        with tf.device(tf.DeviceSpec(device_type="GPU", device_index=1)):
            op1 = model(input1)
        with tf.device(tf.DeviceSpec(device_type="CPU", device_index=0)):
            concatenated_ops = concatenate([op0, op1], axis=-1, name='check_conc1')
    mixmodel = Model(inputs=[input0, input1], outputs=concatenated_ops)
    return mixmodel
mymodel = model_definition()
mymodel.fit_generator()
Expected result: While training, computations for op0 and op1 should be done on gpu0 and gpu1, respectively.
Problem 1: When I have 2 GPUs available, the training works fine, and nvidia-smi shows that both GPUs are being used. However, I am not sure whether both GPUs are doing their intended work. How can I confirm that?
Even though I set log_device_placement to True, I don't see any task allocated to GPU 1.
Problem 2: When I run this code on a machine with only 1 GPU, it still runs fine. I would expect it to raise an error, because GPU 1 is not available.
The example shown here works as expected. It does not exhibit problem 2, i.e. on a single GPU it raises an error.
So I think some manipulation is happening inside Keras.
I have also tried using import tensorflow.python.keras instead of import keras, in case that was causing a conflict.
However, both problems persist.
I would appreciate any clue about this issue. Thank you.

Keras 1.0: getting intermediate layer output

I am currently trying to visualize the output of an intermediate layer in Keras 1.0 (which I could do with Keras 0.3) but it does not work anymore.
x = model.input
y = model.layers[3].output
f = theano.function([x], y)
But I get the following error:
MissingInputError: ("An input of the graph, used to compute DimShuffle{x,x,x,x}(keras_learning_phase), was not provided and not given a value.Use the Theano flag exception_verbosity='high',for more information on this error.", keras_learning_phase)
Prior to Keras 1.0, with my graph model, I could just do:
x = graph.inputs['input'].input
y = graph.nodes[layer].get_output(train=False)
f = theano.function([x], y, allow_input_downcast=True)
So I suspect it to come from the "train=False" parameter which I don't know how to set in the new version.
Thank you for your help
Try this. First add the following imports:
from keras import backend as K
from theano import function
Then build a backend function that also takes the learning phase as an input:
get_3rd_layer_output = K.function([model.layers[0].input, K.learning_phase()],
                                  [model.layers[3].output])
# output in test mode = 0
layer_output = get_3rd_layer_output([X_test, 0])[0]
# output in train mode = 1
layer_output = get_3rd_layer_output([X_train, 1])[0]
This was just answered by François Chollet on GitHub:
Your model apparently has a different behavior in training and test mode, and so needs to know what mode it should be using.
Use
iterate = K.function([input_img, K.learning_phase()], [loss, grads])
and pass 1 or 0 as value for the learning phase, based on whether you want the model in training mode or test mode.
https://github.com/fchollet/keras/issues/2417
