Shapes are not consistent - openvino

I run mo (the Model Optimizer in the OpenVINO toolkit) as below:
mo --input_model ../models/middlebury_d400.pb --input_shape [1,352,704,6]
and get the following error messages:
Model Optimizer arguments:
Common parameters:
- Path to the Input Model: /home/paul/tf2.x/hitnet-test/openvino/../models/middlebury_d400.pb
- Path for generated IR: /home/paul/tf2.x/hitnet-test/openvino/.
- IR output name: middlebury_d400
- Log level: ERROR
- Batch: Not specified, inherited from the model
- Input layers: Not specified, inherited from the model
- Output layers: Not specified, inherited from the model
- Input shapes: [1,352,704,6]
- Source layout: Not specified
- Target layout: Not specified
- Layout: Not specified
- Mean values: Not specified
- Scale values: Not specified
- Scale factor: Not specified
- Precision of IR: FP32
- Enable fusing: True
- User transformations: Not specified
- Reverse input channels: False
- Enable IR generation for fixed input shape: False
- Use the transformations config file: None
Advanced parameters:
- Force the usage of legacy Frontend of Model Optimizer for model conversion into IR: False
- Force the usage of new Frontend of Model Optimizer for model conversion into IR: False
TensorFlow specific parameters:
- Input model in text protobuf format: False
- Path to model dump for TensorBoard: None
- List of shared libraries with TensorFlow custom layers implementation: None
- Update the configuration file with input/output node names: None
- Use configuration file used to generate the model with Object Detection API: None
- Use the config file: None
OpenVINO runtime found in: /opt/intel/openvino_2022/python/python3.8/openvino
OpenVINO runtime version: 2022.1.0-7019-cdb9bec7210-releases/2022/1
Model Optimizer version: 2022.1.0-7019-cdb9bec7210-releases/2022/1
[ WARNING ] Changing Const node '6284' data type from int64 to <class 'numpy.float32'> for Mul operation
[ WARNING ] Changing Const node '6286' data type from int64 to <class 'numpy.float32'> for Mul operation
[ WARNING ] Changing Const node '6288' data type from int64 to <class 'numpy.float32'> for Mul operation
[ WARNING ] Changing Const node '6292' data type from int64 to <class 'numpy.float32'> for Mul operation
[ WARNING ] Changing Const node '6290' data type from int64 to <class 'numpy.float32'> for Mul operation
[ WARNING ] Changing Const node '6298' data type from int64 to <class 'numpy.float32'> for Mul operation
[ WARNING ] Changing Const node '6278' data type from int64 to <class 'numpy.float32'> for Mul operation
[ WARNING ] Changing Const node '6294' data type from int64 to <class 'numpy.float32'> for Mul operation
[ WARNING ] Changing Const node '6280' data type from int64 to <class 'numpy.float32'> for Mul operation
[ WARNING ] Changing Const node '6296' data type from int64 to <class 'numpy.float32'> for Mul operation
[ WARNING ] Changing Const node '6282' data type from int64 to <class 'numpy.float32'> for Mul operation
[ WARNING ] Changing Const node '6300' data type from int64 to <class 'numpy.float32'> for Mul operation
[ WARNING ] Changing Const node 'shared/refinement_l2/Slice/where_max_ends_is_needed_input_port_0/value' data type from int64 to <class 'numpy.int32'> for Equal operation
[ ERROR ] Check 'data_pshape[i].compatible(indices_pshape[i])' failed at core/shape_inference/include/gather_shape_inference.hpp:80:
While validating node 'v0::Gather Gather_4901 (level5/level_init/Reshape_2/Transpose[0]:f32{1,96,192,400}, level5/level_init/GatherV2_1/Cast_1[0]:i32{1,1,96,192}, level5/level_init/GatherV2_1/axis[0]:i64{}) -> ()' with friendly_name 'Gather_4901':
Shapes {1,96,192,400} and {1,1,96,192} are not consistent. data and indices must have equal or intersecting sizes until batch_dims
[ ERROR ] offline transformations step has failed.
For the middlebury_d400.pb, you can wget it with:
wget -P . -N https://storage.googleapis.com/tensorflow-graphics/models/hitnet/default_models/middlebury_d400.pb
Please advise how to fix the above error. Thank you.

Run Model Optimizer with the additional parameter --disable_nhwc_to_nchw, which disables the default translation from NHWC to NCHW during TensorFlow model conversion.
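For example, the conversion from the question would be re-run with the flag appended (same model path and input shape as above):
mo --input_model ../models/middlebury_d400.pb --input_shape [1,352,704,6] --disable_nhwc_to_nchw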

Related

MLflow webserver returns 400 status, "Incompatible input types for column X. Can not safely convert float64 to <U0."

I am implementing an anomaly detection web service using MLflow and sklearn.pipeline.Pipeline(). The aim of the model is to detect web crawlers from server logs, and the response_length column is one of my features. After serving the model, to test the web service I send the request below, which contains the first 20 columns of the training data.
$ curl --location --request POST '127.0.0.1:8000/invocations' \
--header 'Content-Type: text/csv' \
--data-binary '@datasets/test.csv'
But response of the web server has status code 400 (BAD REQUEST) and this JSON body:
{
  "error_code": "BAD_REQUEST",
  "message": "Incompatible input types for column response_length. Can not safely convert float64 to <U0."
}
Here is the model compilation MLflow Tracking component log:
[Pipeline] ......... (step 1 of 3) Processing transform, total=11.8min
[Pipeline] ............... (step 2 of 3) Processing pca, total= 4.8s
[Pipeline] ........ (step 3 of 3) Processing rule_based, total= 0.0s
2021/07/16 04:55:12 WARNING mlflow.sklearn: Training metrics will not be recorded because training labels were not specified. To automatically record training metrics, provide training labels as inputs to the model training function.
2021/07/16 04:55:12 WARNING mlflow.utils.autologging_utils: MLflow autologging encountered a warning: "/home/matin/workspace/Rahnema College/venv/lib/python3.8/site-packages/mlflow/models/signature.py:129: UserWarning: Hint: Inferred schema contains integer column(s). Integer columns in Python cannot represent missing values. If your input data contains missing values at inference time, it will be encoded as floats and will cause a schema enforcement error. The best way to avoid this problem is to infer the model schema based on a realistic data sample (training dataset) that includes missing values. Alternatively, you can declare integer columns as doubles (float64) whenever these columns may have missing values. See `Handling Integers With Missing Values <https://www.mlflow.org/docs/latest/models.html#handling-integers-with-missing-values>`_ for more details."
Logged data and model in run: 8843336f5c31482c9e246669944b1370
---------- logged params ----------
{'memory': 'None',
'pca': 'PCAEstimator()',
'rule_based': 'RuleBasedEstimator()',
'steps': "[('transform', <log_transformer.LogTransformer object at "
"0x7f05a8b95760>), ('pca', PCAEstimator()), ('rule_based', "
'RuleBasedEstimator())]',
'transform': '<log_transformer.LogTransformer object at 0x7f05a8b95760>',
'verbose': 'True'}
---------- logged metrics ----------
{}
---------- logged tags ----------
{'estimator_class': 'sklearn.pipeline.Pipeline', 'estimator_name': 'Pipeline'}
---------- logged artifacts ----------
['model/MLmodel',
'model/conda.yaml',
'model/model.pkl',
'model/requirements.txt']
Could anyone tell me exactly how I can fix this model serving problem?
The problem is caused by the mlflow.utils.autologging_utils WARNING.
When the model is created, the data input signature is saved in the MLmodel file.
You should change the response_length signature input type from string to double, i.e. replace
{"name": "response_length", "type": "string"}
with
{"name": "response_length", "type": "double"}
so it doesn't need to be converted. After serving the model with the edited MLmodel file, the web server worked as expected.
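If you would rather not edit the MLmodel file by hand, an alternative is to log the model with an explicit signature so the double type is written from the start. A minimal sketch, assuming pipeline is the trained sklearn pipeline from the question and that the remaining feature columns are declared analogously:

import mlflow
import mlflow.sklearn
from mlflow.models.signature import ModelSignature
from mlflow.types.schema import ColSpec, Schema

# Declare response_length as double so MLflow does not infer a string type for it.
input_schema = Schema([
    ColSpec("double", "response_length"),
    # ... one ColSpec per remaining feature column ...
])
signature = ModelSignature(inputs=input_schema)

with mlflow.start_run():
    mlflow.sklearn.log_model(pipeline, "model", signature=signature)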

Change trialId in Google AI Platform hyperparameter tuning

I'm trying to follow this tutorial on hyperparameter tuning on AI Platform: https://cloud.google.com/blog/products/gcp/hyperparameter-tuning-on-google-cloud-platform-is-now-faster-and-smarter.
My configuration yaml file looks like this:
trainingInput:
  hyperparameters:
    goal: MINIMIZE
    hyperparameterMetricTag: loss
    maxTrials: 4
    maxParallelTrials: 2
    params:
    - parameterName: learning_rate
      type: DISCRETE
      discreteValues:
      - 0.0005
      - 0.001
      - 0.0015
      - 0.002
The expected output:
"completedTrialCount": "4",
"trials": [
{
"trialId": "3",
"hyperparameters": {
"learning_rate": "2e-03"
},
"finalMetric": {
"trainingStep": "123456",
"objectiveValue": 0.123456
},
},
Is there any way to customize the trialId instead of the default numeric values (e.g. 1, 2, 3, 4...)?
It is not possible to customize the trialId, as it depends on the parameter maxTrials in your hyperparameter tuning config.
maxTrials only accepts integers, so the value assigned to trialId will range from 1 to your defined maxTrials.
This is also shown by the example in your post, where maxTrials: 40 is set and the resulting JSON shows trialId: 35, which is within the range of maxTrials:
This indicates that 40 trials have been completed, and the best so far
is trial 35, which achieved an objective of 1.079 with the
hyperparameter values of nembeds=18 and nnsize=32.
Example output:

Use *.pth model in C++

I want to run inference in C++ using a YOLOv3 model I trained with PyTorch. I am unable to convert the model using the tracing and scripting facilities provided by PyTorch; I get this error during conversion:
First diverging operator:
Node diff:
- %2 : __torch__.torch.nn.modules.container.ModuleList = prim::GetAttr[name="module_list"](%self.1)
+ %2 : __torch__.torch.nn.modules.container.___torch_mangle_139.ModuleList = prim::GetAttr[name="module_list"](%self.1)
? ++++++++++++++++++++
ERROR: Tensor-valued Constant nodes differed in value across invocations. This often indicates that the tracer has encountered untraceable code.
Node:
%358 : Tensor = prim::Constant[value=<Tensor>](), scope: __module.module_list.16.yolo_16
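For reference, the tracing path that produces this kind of error follows the usual torch.jit pattern. A minimal sketch, where the Darknet model class, the config/weights paths, and the 416x416 input size are illustrative assumptions rather than details from the question:

import torch
from models import Darknet  # hypothetical YOLOv3 model definition from the training repo

model = Darknet("config/yolov3.cfg")  # hypothetical config path
model.load_state_dict(torch.load("yolov3.pth", map_location="cpu"))
model.eval()

example = torch.rand(1, 3, 416, 416)      # assumed input resolution
traced = torch.jit.trace(model, example)  # tracing is where the divergence above is reported
traced.save("yolov3_traced.pt")           # TorchScript file loadable in C++ via torch::jit::load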

Input 'input_image' of layer '63' not found in any of the outputs of the preceeding layers

Update #1 (original question and details below):
As per the suggestion of @MatthijsHollemans below, I tried removing dynamic_axes from the initial create_onnx step below and re-running. This removed both:
Description of image feature 'input_image' has missing or non-positive width 0.
and
Input 'input_image' of layer '63' not found in any of the outputs of the preceeding layers.
Unfortunately this opens up two sub-questions:
I still want to have a functional ONNX model. Is there a more appropriate way to make H and W dynamic? Or should I be saving two versions of the ONNX model, one without dynamic_axes for the CoreML conversion, and one with dynamic_axes for use as a valid ONNX model?
Although this solves the compilation error in Xcode (specified below), it introduces the following runtime issues:
Finalizing CVPixelBuffer 0x282f4c5a0 while lock count is 1.
[espresso] [Espresso::handle_ex_plan] exception=Invalid X-dimension 1/480 status=-7
[coreml] Error binding image input buffer input_image: -7
[coreml] Failure in bindInputsAndOutputs.
I am calling this the same way I was calling the fixed size model, which does still work fine. The image dimensions are 640 x 480.
As specified below the model should accept any image between 64x64 and higher.
For flexible shape models, do I need to provide the input differently in Xcode?
Original Question (parts still relevant)
I have been slowly working on converting a style transfer model from pytorch > onnx > coreml. One of the issues that has been a struggle is flexible/dynamic input + output shape.
This method (besides i/o renaming) has worked well on iOS 12 & 13 when using a static input shape.
I am using the following code to do the onnx > coreml conversion:
# Imports implied by the calls below (assuming the onnx-coreml and coremltools packages).
import coremltools
from coremltools.models.neural_network import flexible_shape_utils
from onnx_coreml import convert

def create_coreml(name):
    mlmodel = convert(
        model="onnx/" + name + ".onnx",
        preprocessing_args={'is_bgr': True},
        deprocessing_args={'is_bgr': True},
        image_input_names=['input_image'],
        image_output_names=['stylized_image'],
        minimum_ios_deployment_target='13'
    )

    spec = mlmodel.get_spec()

    img_size_ranges = flexible_shape_utils.NeuralNetworkImageSizeRange()
    img_size_ranges.add_height_range((64, -1))
    img_size_ranges.add_width_range((64, -1))

    flexible_shape_utils.update_image_size_range(
        spec,
        feature_name='input_image',
        size_range=img_size_ranges)
    flexible_shape_utils.update_image_size_range(
        spec,
        feature_name='stylized_image',
        size_range=img_size_ranges)

    mlmodel = coremltools.models.MLModel(spec)
    mlmodel.save("mlmodel/" + name + ".mlmodel")
Although the conversion 'succeeds' there are a couple of warnings (spaces added for readability):
Translation to CoreML spec completed. Now compiling the CoreML model.
/usr/local/lib/python3.7/site-packages/coremltools/models/model.py:111:
RuntimeWarning: You will not be able to run predict() on this Core ML model. Underlying exception message was:
Error compiling model:
"Error reading protobuf spec. validator error: Description of image feature 'input_image' has missing or non-positive width 0.".
RuntimeWarning)
Model Compilation done.
/usr/local/lib/python3.7/site-packages/coremltools/models/model.py:111:
RuntimeWarning: You will not be able to run predict() on this Core ML model. Underlying exception message was:
Error compiling model:
"compiler error: Input 'input_image' of layer '63' not found in any of the outputs of the preceeding layers.
".
RuntimeWarning)
If I ignore these warnings and try to compile the model for the latest target (13.0), I get the following error in Xcode:
coremlc: Error: compiler error: Input 'input_image' of layer '63' not found in any of the outputs of the preceeding layers.
Here is what the problematic area appears to look like in netron:
My main question is how can I get these two warnings out of the way?
Happy to provide any other details.
Thanks for any advice!
Below is my pytorch > onnx conversion:
import onnx
import torch

import transformer  # module that defines TransformerNetwork (from the training code)

def create_onnx(name):
    prior = torch.load("pth/" + name + ".pth")
    model = transformer.TransformerNetwork()
    model.load_state_dict(prior)

    dummy_input = torch.zeros(1, 3, 64, 64)  # I wasn't sure what I would set the H W to here?

    torch.onnx.export(model, dummy_input, "onnx/" + name + ".onnx",
                      verbose=True,
                      opset_version=10,
                      input_names=["input_image"],      # These are being renamed from garbled originals.
                      output_names=["stylized_image"],  # ^
                      dynamic_axes={'input_image': {2: 'height', 3: 'width'},
                                    'stylized_image': {2: 'height', 3: 'width'}})

    onnx.save_model(original_model, "onnx/" + name + ".onnx")
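As described in Update #1 above, the variant that avoids the CoreML compiler errors simply drops dynamic_axes from this export. A minimal sketch of just that call, reusing model and dummy_input from above (the _static suffix is only an illustrative file name):

# Fixed-shape export intended only for the CoreML conversion path (no dynamic_axes).
torch.onnx.export(model, dummy_input, "onnx/" + name + "_static.onnx",
                  verbose=True,
                  opset_version=10,
                  input_names=["input_image"],
                  output_names=["stylized_image"])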

How to generate tflite from saved model?

I want to create an object-detection app based on an ssd_mobilenet model that I retrained like the guy on YouTube.
I chose the model ssd_mobilenet_v2_coco from the TensorFlow Model Zoo. After the retraining process, I got a model with the following structure:
- saved_model
  - variables (empty folder)
  - saved_model.pb
- checkpoint
- frozen_inference_graph.pb
- model.ckpt.data-00000-of-00001
- model.ckpt.index
- model.ckpt.meta
- pipeline.config
In the same folder, I have the python script with the following code:
import tensorflow as tf
converter = tf.lite.TFLiteConverter.from_saved_model("saved_model", input_shapes={"image_tensor":[1,300,300,3]})
tflite_model = converter.convert()
open("converted_model.tflite", "wb").write(tflite_model)
After running this code, I got the following error:
...
2019-05-24 18:46:59.811289: I tensorflow/lite/toco/import_tensorflow.cc:1324] Converting unsupported operation: TensorArrayGatherV3
2019-05-24 18:46:59.811864: I tensorflow/lite/toco/import_tensorflow.cc:1373] Unable to determine output type for op: TensorArrayGatherV3
2019-05-24 18:46:59.908207: I tensorflow/lite/toco/graph_transformations/graph_transformations.cc:39] Before Removing unused ops: 1792 operators, 3033 arrays (0 quantized)
2019-05-24 18:47:00.089034: I tensorflow/lite/toco/graph_transformations/graph_transformations.cc:39] After Removing unused ops pass 1: 1771 operators, 2979 arrays (0 quantized)
2019-05-24 18:47:00.314681: I tensorflow/lite/toco/graph_transformations/graph_transformations.cc:39] Before general graph transformations: 1771 operators, 2979 arrays (0 quantized)
2019-05-24 18:47:00.453570: F tensorflow/lite/toco/graph_transformations/resolve_constant_slice.cc:59] Check failed: dim_size >= 1 (0 vs. 1)
Is there any solution for the "Check failed: dim_size >= 1 (0 vs. 1)"?
Conversion of MobileNet SSD is a little different due to some Custom ops that are needed in the graph.
Take a look at this Medium post for the end-to-end process of training and exporting the model as a TFLite graph. For conversion, you would need to use the export_tflite_ssd_graph script.
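Roughly, that flow is: run the Object Detection API's export_tflite_ssd_graph.py (with --pipeline_config_path pointing at pipeline.config, --trained_checkpoint_prefix at model.ckpt, and --add_postprocessing_op=true) to produce a TFLite-compatible tflite_graph.pb, then convert that frozen graph instead of the SavedModel. Below is a sketch of the conversion step, assuming the exported graph and the standard input/output tensor names documented for the Object Detection API:

import tensorflow as tf

# TF 1.x converter, pointed at the graph produced by export_tflite_ssd_graph.py.
converter = tf.lite.TFLiteConverter.from_frozen_graph(
    "tflite_graph/tflite_graph.pb",
    input_arrays=["normalized_input_image_tensor"],
    output_arrays=["TFLite_Detection_PostProcess", "TFLite_Detection_PostProcess:1",
                   "TFLite_Detection_PostProcess:2", "TFLite_Detection_PostProcess:3"],
    input_shapes={"normalized_input_image_tensor": [1, 300, 300, 3]})
converter.allow_custom_ops = True  # the detection post-processing op is a TFLite custom op
tflite_model = converter.convert()
open("detect.tflite", "wb").write(tflite_model)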
