Tflite (TF2) trained model running but not detecting in iOS app - python-3.x

I have fine-tuned a pre-trained model (ssd_mobilenet_v2_fpnlite_640x640) using the TF2 Object Detection API, exported it to an intermediate SavedModel, and then converted it to a TFLite model, following these tutorials:
TF2 Object Detection API, Running TF2 Detection API Models on mobile, Converter Python API guide, and the Edge TF Lite iOS tutorial.
After many hours of work I managed to get my model to predict in a Python environment and to run in the pre-made iOS app from TF Lite.
However, after trying many ways of exporting and converting the model, I cannot get it to detect the objects I trained it on.
The following is the command used to train the model with the TF2 API:
python3 model_main_tf2.py \
--pipeline_config_path={pipeline_path} \
--model_dir={output_model_dir} \
--alsologtostderr
This is the command for exporting the SavedModel with the TF2 API:
python export_tflite_graph_tf2.py \
--pipeline_config_path {pipeline_path} \
--trained_checkpoint_dir {output_model_dir} \
--output_directory {exported_models_dir}
And the following is the code to convert the model to TFLite with the Python API:
converter = tf.lite.TFLiteConverter.from_saved_model(export_dir)
tflite_model = converter.convert()
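For reference, a slightly fuller sketch of the same Python API conversion, which also writes the .tflite file (the output path and the custom-op settings are assumptions; they may not be needed for every export):

import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_saved_model(export_dir)
# The SSD NMS step is emitted as the custom op TFLite_Detection_PostProcess;
# allowing custom ops keeps the converter from rejecting it.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS]
converter.allow_custom_ops = True
tflite_model = converter.convert()

with open("model.tflite", "wb") as f:
    f.write(tflite_model)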
I have also tried some alternatives, like converting with the TF1 API:
converter = tf.compat.v1.lite.TFLiteConverter.from_saved_model(export_dir)
converter.inference_type = tf.compat.v1.lite.constants.QUANTIZED_UINT8
input_arrays = converter.get_input_arrays()
converter.quantized_input_stats = {input_arrays[0] : (0., 1.)} # mean_value, std_dev
tflite_model = converter.convert()
and from the command line:
tflite_convert \
--saved_model_dir={saved_model} \
--output_file={output_dir} \
--output_format=TFLITE \
--input_shapes=1,640,640,3 \
--input_arrays='normalized_input_image_tensor' \
--output_arrays='TFLite_Detection_PostProcess','TFLite_Detection_PostProcess:1','TFLite_Detection_PostProcess:2','TFLite_Detection_PostProcess:3' \
--inference_type=QUANTIZED_UINT8 \
--mean_values=128 \
--std_dev_values=127 \
--change_concat_input_ranges=false \
--allow_custom_ops
The result is a 500-byte file. This tflite model looks as follows (in Netron):
In the iOS app I adjusted the code this way:
// MARK: Model parameters
let batchSize = 1
let inputChannels = 3
let inputWidth = 640
let inputHeight = 640
// image mean and std for floating model, should be consistent with parameters used in model training
let imageMean: Float = 128
let imageStd: Float = 127
I have also tried some other SSD MobileNet models, unsuccessfully. I've been stuck for several days already; I'd appreciate your help.

Install tf-nightly and convert the saved_model to a tflite file:
https://pypi.org/project/tf-nightly/
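A quick way to sanity-check the converted file before dropping it into the iOS app is to load it back with the TFLite Python interpreter and inspect its tensors; a 500-byte stub will not report the expected shapes. A minimal sketch (the file name is a placeholder):

import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()

# A correctly converted SSD model should report a [1, 640, 640, 3] image input
# and four post-processed detection outputs (boxes, classes, scores, count).
print(interpreter.get_input_details())
print(interpreter.get_output_details())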

Related

Clarifications on training job parameters with Tensorflow

I'm using the new Tensorflow Object Detection API.
I need to replicate the training parameters used in a paper, but I'm a bit confused.
The paper states:
When training neural network models, their base configuration is similar to that used to train on the COCO 2017 dataset. For the unambiguous comparison of the selected models, the total number of training steps was set to 100 equal to 100′000 iterations of learning.
Inside model_main_tf2.py, which is the script used to start the training, I can read the following:
"""Creates and runs TF2 object detection models.
For local training/evaluation run:
PIPELINE_CONFIG_PATH=path/to/pipeline.config
MODEL_DIR=/tmp/model_outputs
NUM_TRAIN_STEPS=10000
SAMPLE_1_OF_N_EVAL_EXAMPLES=1
python model_main_tf2.py -- \
--model_dir=$MODEL_DIR --num_train_steps=$NUM_TRAIN_STEPS \
--sample_1_of_n_eval_examples=$SAMPLE_1_OF_N_EVAL_EXAMPLES \
--pipeline_config_path=$PIPELINE_CONFIG_PATH \
--alsologtostderr
"""
Also, you can specify the num_steps and total_steps parameters in the pipeline.config file (used by the training script):
train_config: {
  batch_size: 1
  sync_replicas: true
  startup_delay_steps: 0
  replicas_to_aggregate: 8
  num_steps: 50000
  optimizer {
    momentum_optimizer: {
      learning_rate: {
        cosine_decay_learning_rate {
          learning_rate_base: .16
          total_steps: 50000
          warmup_learning_rate: 0
          warmup_steps: 2500
        }
      }
      momentum_optimizer_value: 0.9
    }
    use_moving_average: false
  }
}
So, what I'm not understanding is how I should map what is written in the paper to the TensorFlow parameters.
What are num_steps and total_steps inside the pipeline.config file?
And what is the NUM_TRAIN_STEPS argument?
Does it override the steps in the config file, or is it a completely different thing?
If more details are needed, feel free to ask.
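As a quick way to check which step counts are actually in effect, the pipeline config can be read programmatically with the Object Detection API's config utilities; a minimal sketch (the path is a placeholder):

from object_detection.utils import config_util

configs = config_util.get_configs_from_pipeline_file("path/to/pipeline.config")
train_config = configs["train_config"]

# num_steps controls how long training runs; total_steps belongs to the
# cosine learning-rate schedule.
print("num_steps:", train_config.num_steps)
print("cosine decay total_steps:",
      train_config.optimizer.momentum_optimizer.learning_rate
      .cosine_decay_learning_rate.total_steps)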

Integrate tflite model with tensorflow object-detection example code

I have a tflite model which I obtained by converting my TF model (MobileNet Single Shot Detector (v2)).
I successfully converted my model to tflite format using the code below.
!tflite_convert \
--input_shape=1,300,300,3 \
--input_arrays=normalized_input_image_tensor \
--output_arrays=TFLite_Detection_PostProcess,TFLite_Detection_PostProcess:1,TFLite_Detection_PostProcess:2,TFLite_Detection_PostProcess:3 \
--allow_custom_ops \
--graph_def_file=/content/models/research/fine_tuned_model/tflite/tflite_graph.pb \
--output_file="/content/models/research/fine_tuned_model/final_model.tflite"
I have tried to integrate it into the object-detection example code provided by the TensorFlow team, but the output is not visible.
The steps I took to integrate it were as follows:
1. Commented out the line below in build.gradle (app):
apply from:'download_model.gradle'
2. Added my tflite model to the assets folder and modified label.txt with my own labels.
3. In DetectorActivity I set the boolean below to false:
private static final boolean TF_OD_API_IS_QUANTIZED = true;
4. And reduced the minimum confidence below to 0.2:
private static final float MINIMUM_CONFIDENCE_TF_OD_API = 0.5f;
But it didn't work.
The GitHub link to the object-detection code:
https://github.com/tensorflow/examples/blob/master/lite/examples/object_detection/android
Also, please let me know how to test the tflite model using test images.
These are the values I get when debugging the model:
[[[ 0.15021165 0.45557776 0.99523586 1.009417 ]
[ 0.4825344 0.18693507 0.9941584 0.83610606]
[ 0.36018616 0.612343 1.0781565 1.1020089 ]
[ 0.47380492 0.03632754 0.99250865 0.5964786 ]
[ 0.15898478 0.12117874 0.94728076 0.8854655 ]
[ 0.44774154 0.41910237 0.9966481 0.9704595 ]
[ 0.06241751 -0.02005028 0.93670964 0.3915068 ]
[ 0.1917564 0.00806974 1.0165613 0.5287838 ]
[ 0.20279509 0.738887 0.95690674 1.0022873 ]
[ 0.7434618 0.07342905 0.9969055 0.6412263 ]]]
First train your model, or better, start from a pre-trained one. After training, export the tflite graph and convert it to a tflite model using Bazel (preferably on Ubuntu). Then get tensorflow/examples and open the object detection Android folder in Android Studio.
Remove the metadata code from the tflite interpreter class, since the example demands metadata and there is no officially documented way to add it. After that you can make it work.
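On the question of testing the tflite model with test images: one option outside the app is the TFLite Python interpreter. A minimal sketch (the file names, the 300x300 input size, and the float preprocessing are assumptions and should match how the model was actually converted):

import numpy as np
import tensorflow as tf
from PIL import Image

interpreter = tf.lite.Interpreter(model_path="final_model.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Resize a test image to the model's input size and scale to [-1, 1]
# for a float model; a quantized model would instead take raw uint8 values.
img = Image.open("test.jpg").convert("RGB").resize((300, 300))
data = (np.expand_dims(np.asarray(img, dtype=np.float32), 0) - 127.5) / 127.5

interpreter.set_tensor(input_details[0]["index"], data)
interpreter.invoke()

# With TFLite_Detection_PostProcess the outputs are usually boxes, classes,
# scores and the number of detections, in that order.
boxes = interpreter.get_tensor(output_details[0]["index"])
classes = interpreter.get_tensor(output_details[1]["index"])
scores = interpreter.get_tensor(output_details[2]["index"])
print(boxes, classes, scores)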

Can SAC be used instead PPO in Cartpole example?

I'm studying AzureML RL with the example code.
I could run the cartpole example (cartpole_ci.ipynb), which trains the PPO model on a compute instance.
I tried SAC instead of PPO by changing training_algorithm = "PPO" to training_algorithm = "SAC",
but it failed with the message below.
ray.rllib.utils.error.UnsupportedSpaceException: Action space Discrete(2) is not supported for SAC.
Has someone tried the SAC algorithm on AzureML RL, and did it work?
AzureML RL does support SAC with discrete actions, but not parametric actions; I have confirmed this in the Ray feature compatibility matrix: https://docs.ray.io/en/latest/rllib-algorithms.html#feature-compatibility-matrix
Are you following the code sample?
from azureml.contrib.train.rl import ReinforcementLearningEstimator, Ray

training_algorithm = "PPO"
rl_environment = "CartPole-v0"

script_params = {
    # Training algorithm
    "--run": training_algorithm,
    # Training environment
    "--env": rl_environment,
    # Algorithm-specific parameters
    "--config": '\'{"num_gpus": 0, "num_workers": 1}\'',
    # Stop conditions
    "--stop": '\'{"episode_reward_mean": 200, "time_total_s": 300}\'',
    # Frequency of taking checkpoints
    "--checkpoint-freq": 2,
    # If a checkpoint should be taken at the end - optional argument with no value
    "--checkpoint-at-end": "",
    # Log directory
    "--local-dir": './logs'
}

training_estimator = ReinforcementLearningEstimator(
    # Location of source files
    source_directory='files',
    # Python script file
    entry_script='cartpole_training.py',
    # A dictionary of arguments to pass to the training script specified in ``entry_script``
    script_params=script_params,
    # The Azure Machine Learning compute target set up for Ray head nodes
    compute_target=compute_target,
    # Reinforcement learning framework. Currently must be Ray.
    rl_framework=Ray()
)
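To check whether the installed Ray/RLlib build accepts a discrete action space with SAC independently of AzureML, a short local run can help; a minimal sketch (assuming ray[rllib] and gym are installed, with arbitrary stop conditions):

import ray
from ray import tune

ray.init(ignore_reinit_error=True)

# CartPole-v0 has a Discrete(2) action space; if the installed RLlib version
# lacks discrete SAC support, this raises UnsupportedSpaceException.
tune.run(
    "SAC",
    config={"env": "CartPole-v0", "num_workers": 1, "num_gpus": 0},
    stop={"episode_reward_mean": 150, "time_total_s": 300},
)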

Input 'input_image' of layer '63' not found in any of the outputs of the preceeding layers

Update #1 (original question and details below):
As per the suggestion of @MatthijsHollemans below, I've tried running this after removing dynamic_axes from the initial create_onnx step below. This removed both:
Description of image feature 'input_image' has missing or non-positive width 0.
and
Input 'input_image' of layer '63' not found in any of the outputs of the preceeding layers.
Unfortunately this opens up two sub-questions:
I still want to have a functional ONNX model. Is there a more appropriate way to make H and W dynamic? Or should I be saving two versions of the ONNX model, one without dynamic_axes for the CoreML conversion, and one with for use as a valid ONNX model?
Although this solves the compilation error in Xcode (specified below), it introduces the following runtime issues:
Finalizing CVPixelBuffer 0x282f4c5a0 while lock count is 1.
[espresso] [Espresso::handle_ex_plan] exception=Invalid X-dimension 1/480 status=-7
[coreml] Error binding image input buffer input_image: -7
[coreml] Failure in bindInputsAndOutputs.
I am calling this the same way I was calling the fixed size model, which does still work fine. The image dimensions are 640 x 480.
As specified below the model should accept any image between 64x64 and higher.
For flexible-shape models, do I need to provide the input differently in Xcode?
Original Question (parts still relevant)
I have been slowly working on converting a style transfer model from pytorch > onnx > coreml. One of the issues that has been a struggle is flexible/dynamic input + output shape.
This method (besides i/o renaming) has worked well on iOS 12 & 13 when using a static input shape.
I am using the following code to do the onnx > coreml conversion:
def create_coreml(name):
    mlmodel = convert(
        model="onnx/" + name + ".onnx",
        preprocessing_args={'is_bgr': True},
        deprocessing_args={'is_bgr': True},
        image_input_names=['input_image'],
        image_output_names=['stylized_image'],
        minimum_ios_deployment_target='13'
    )

    spec = mlmodel.get_spec()

    img_size_ranges = flexible_shape_utils.NeuralNetworkImageSizeRange()
    img_size_ranges.add_height_range((64, -1))
    img_size_ranges.add_width_range((64, -1))

    flexible_shape_utils.update_image_size_range(
        spec,
        feature_name='input_image',
        size_range=img_size_ranges)
    flexible_shape_utils.update_image_size_range(
        spec,
        feature_name='stylized_image',
        size_range=img_size_ranges)

    mlmodel = coremltools.models.MLModel(spec)
    mlmodel.save("mlmodel/" + name + ".mlmodel")
Although the conversion 'succeeds' there are a couple of warnings (spaces added for readability):
Translation to CoreML spec completed. Now compiling the CoreML model.
/usr/local/lib/python3.7/site-packages/coremltools/models/model.py:111:
RuntimeWarning: You will not be able to run predict() on this Core ML model. Underlying exception message was:
Error compiling model:
"Error reading protobuf spec. validator error: Description of image feature 'input_image' has missing or non-positive width 0.".
RuntimeWarning)
Model Compilation done.
/usr/local/lib/python3.7/site-packages/coremltools/models/model.py:111:
RuntimeWarning: You will not be able to run predict() on this Core ML model. Underlying exception message was:
Error compiling model:
"compiler error: Input 'input_image' of layer '63' not found in any of the outputs of the preceeding layers.
".
RuntimeWarning)
If I ignore these warnings and try to compile the model for the latest target (13.0), I get the following error in Xcode:
coremlc: Error: compiler error: Input 'input_image' of layer '63' not found in any of the outputs of the preceeding layers.
Here is what the problematic area appears to look like in netron:
My main question is how can I get these two warnings out of the way?
Happy to provide any other details.
Thanks for any advice!
Below is my pytorch > onnx conversion:
def create_onnx(name):
    prior = torch.load("pth/" + name + ".pth")
    model = transformer.TransformerNetwork()
    model.load_state_dict(prior)

    dummy_input = torch.zeros(1, 3, 64, 64)  # I wasn't sure what I would set the H and W to here.

    torch.onnx.export(model, dummy_input, "onnx/" + name + ".onnx",
                      verbose=True,
                      opset_version=10,
                      input_names=["input_image"],      # These are being renamed from garbled originals.
                      output_names=["stylized_image"],  # ^
                      dynamic_axes={'input_image': {2: 'height', 3: 'width'},
                                    'stylized_image': {2: 'height', 3: 'width'}}
                      )

    onnx.save_model(original_model, "onnx/" + name + ".onnx")
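Regarding the first sub-question above, one option is to export two ONNX files from the same weights: a fixed-shape one used only for the CoreML conversion, and a dynamic-shape one kept as the general-purpose ONNX model. A sketch (function and file names are illustrative):

import torch

def create_onnx_variants(name, model):
    dummy_input = torch.zeros(1, 3, 64, 64)
    common = dict(verbose=True, opset_version=10,
                  input_names=["input_image"],
                  output_names=["stylized_image"])

    # Fixed-shape export, fed to the onnx > coreml conversion.
    torch.onnx.export(model, dummy_input, "onnx/" + name + "_static.onnx", **common)

    # Dynamic-shape export, kept as the standalone ONNX model.
    torch.onnx.export(model, dummy_input, "onnx/" + name + "_dynamic.onnx",
                      dynamic_axes={"input_image": {2: "height", 3: "width"},
                                    "stylized_image": {2: "height", 3: "width"}},
                      **common)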

SVM Model does not support probability estimation?

I am doing some classification task with Support Vector Machines (SVM).
I am using libSVM (with the Matlab interface) to predict the probability estimates matrix. However, libSVM displays the message:
Model does not support probabiliy estimates
Below is my sample code:
(train_label contains labels for training data and test_label contains label for test data)
model = svmtrain(train_label, train_data, '-t 2 -g .01 -c 0.7 -b 1);
[y,accuracy,prob_estimates]=svmpredict(test_label,test_data,model,'-b 1');
Can someone tell me if there is something wrong with the way I am doing it? Any help/suggestion will be appreciated.
Don't know about the Matlab implementation, but usually you have to set this option:
-b probability_estimates: whether to train a SVC or SVR model for probability estimates, 0 or 1 (default 0)
I am using libsvm in the same way without any problem.
In your code, only a ' is missing in the following line:
model = svmtrain(train_label, train_data, '-t 2 -g .01 -c 0.7 -b 1);
It should be
model = svmtrain(train_label, train_data, '-t 2 -g .01 -c 0.7 -b 1');
I had the same problem; the model didn't have ProbA and ProbB in it.
Before, it was like this and gave an error:
linear_model = svmtrain(trainClass, trainData, ['-t 0', cmd]);
Then I changed it to this (removed cmd and put in exact values) and the error disappeared:
linear_model = svmtrain(trainClass, trainData, ['-t 0 -c 1 -g 0.125 -b 1']);
If it still gives an error, try changing the c and g parameters.
Hope this helps.
It is because your model does not support probability estimates.
You should use the '-b 1' option in both the training and the testing process.
See also: https://stackoverflow.com/a/43509667/7893127
You may have just trained the model with the default parameters.
Try using '-b 1' when you run both the training and the testing programs.
In C:\setup\python36\Lib\site-packages\svm.py the default value of self.probability is 0. You can set it to 1.
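For reference, the same '-b 1' requirement applies in libsvm's Python interface; a minimal sketch with toy data (the import path depends on how the libsvm bindings are installed):

from svmutil import svm_train, svm_predict  # or: from libsvm.svmutil import ...

# Toy data: two features, two classes.
y = [0, 0, 1, 1]
x = [[1, 1], [1, 2], [-1, -1], [-2, -1]]

# '-b 1' must be given at training time so the model stores the ProbA/ProbB
# parameters, and again at prediction time to get the probability matrix.
model = svm_train(y, x, '-t 2 -g 0.01 -c 0.7 -b 1')
p_label, p_acc, p_val = svm_predict(y, x, model, '-b 1')
print(p_val)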
