The numpy array (ndarray) cannot be specified as an output layer - onnx

I tried to edit my YOLOv7 model outputs and add a constant output tensor (the number of bounding boxes), but when I run the conversion to TF I get:
WARNING: The numpy array (ndarray) cannot be specified as an output layer. Therefore, the tool outputs a sequentially numbered .npy binary file. .npy_file_path: tf_output/0.npy
Steps to reproduce:
Use onnx_graphsurgeon to modify the PyTorch model's outputs like this:
import numpy as np
import onnx
import onnx_graphsurgeon as gs

# New tensors: an int64 box count and a float32 copy that will become a model output.
num_boxes_int = gs.Variable(name='num_boxes_int', dtype=np.int64)
num_boxes_out = gs.Variable(name='num_boxes_out', dtype=np.float32)

# Shape op that reads dimension 1 of the bbox tensor (the number of boxes).
num_boxes_node = gs.Node(op="Shape",
                         inputs=[bbox_out],
                         outputs=[num_boxes_int],
                         attrs={'start': 1, 'end': 2})
graph.nodes.append(num_boxes_node)

# Cast the count to float32 so it can be exposed as an output.
num_boxes_cast_node = gs.Node(op="Cast",
                              inputs=[num_boxes_int],
                              outputs=[num_boxes_out],
                              attrs={'to': int(onnx.TensorProto.FLOAT)})
graph.nodes.append(num_boxes_cast_node)
...
graph.outputs = [num_boxes_out, ...]
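After wiring up the new output, the graph typically needs to be cleaned up and re-exported before the conversion steps below; a minimal sketch of that step (it is implied rather than shown in the snippet above):

# Remove dangling tensors/nodes and restore topological order after the edits.
graph.cleanup().toposort()
# Save the modified model so it can be passed to the OpenVINO Model Optimizer below.
onnx.save(gs.export_onnx(graph), onnx_model_path)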
Convert to OpenVINO format with !mo --input_model {onnx_model_path} --input_shape [1,3,{input_height},{input_width}] --output_dir {openvino_dir}. Checking it in netron.app, it seems fine.
Convert to TensorFlow:
!openvino2tensorflow \
--model_path {openvino_dir}/{openvino_filename} \
--model_output_path {tf_output_dir} \
--weight_replacement_config {weight_replacement_config_path} \
--non_verbose \
--output_saved_model \
--output_no_quant_float32_tflite \
--output_float16_quant_tflite \
--output_full_integer_quant_tflite \
--output_dynamic_range_quant_tflite
This produces the warning:
WARNING: The numpy array (ndarray) cannot be specified as an output layer. Therefore, the tool outputs a sequentially numbered .npy binary file. .npy_file_path: tf_output/0.npy
Checking the generated tflite files in netron.app, everything looks normal, but the num_detections output is simply missing.
My question:
What is the correct way to output a scalar value? Thanks!!

Unfortunately, this is not supported with OpenVINO. However, you may refer to this GitHub repository for the script to convert the OpenVINO Intermediate Representation (IR) model to TensorFlow format.

Related

Cannot convert a symbolic Tensor (IteratorGetNext:1) to a numpy array

I was trying to implement a metric, namely the APLS (Average Path Length Similarity) metric. I needed to perform some operations on the ground truth and the predicted image and generate a graph before calculating the APLS. Upon passing the ground truth and predicted image to the graph-generating function, I get this error:
NotImplementedError: Cannot convert a symbolic Tensor (IteratorGetNext:1) to a numpy array. This error may indicate that you're trying to pass a Tensor to a NumPy call, which is not supported.
Could someone guide me to a resolution and/or alternative code for my graph-generating function, which presently is:
# Assumed imports (not shown in the original snippet):
import numpy as np
import sknw
from skimage.morphology import skeletonize

def ImageToGraph(imageArray):
    image = np.array(imageArray).astype(bool)
    skeleton = skeletonize(image)
    ske = skeleton.astype(np.uint8)
    return sknw.build_sknw(ske)
The error occurs in line 1 of the function, which says a symbolic tensor could not be converted to a NumPy array.
I tried using the Keras backend methods, but there were issues when I passed those tensors to the skeletonize call: it complains that the skeleton function accepts a 2D image but I provided 5 dimensions (the batch size is 5), even though my ground truth is a 2D image. Below is the model I am trying to train.
model = sm.Unet(
    'efficientnetb0',
    classes=1,
    input_shape=(256, 256, 5),
    encoder_weights=None,
    activation='sigmoid'
)
model.compile(optimizer=Nadam(lr=0.0002), loss=bce_dice_apls_loss, metrics=[dice_coef])
TF does not work with NumPy arrays during graph building. You need to convert your data from NumPy arrays to tensors before using them. Instead of:
image = np.array(imageArray).astype(bool)
you could try
image = tf.constant(imageArray, dtype=tf.bool)
Note that all operations, once in the graph, need to use TF-compatible functions. If you want to use NumPy code, you have to wrap the function in a tf.numpy_function call, as in the sketch below.
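A minimal, self-contained sketch of that pattern; count_foreground here is a hypothetical stand-in for the NumPy/skeleton code, not part of the original question:

import numpy as np
import tensorflow as tf

def count_foreground(image):
    # Plain NumPy code: inside tf.numpy_function it receives a concrete ndarray,
    # so calls like skeletonize / sknw.build_sknw would work in a function like this.
    return np.int64(np.count_nonzero(np.asarray(image).astype(bool)))

@tf.function
def graph_fn(image):
    # tf.numpy_function bridges from graph mode back to eager NumPy execution.
    return tf.numpy_function(count_foreground, [image], tf.int64)

print(graph_fn(tf.constant([[0.0, 1.0], [1.0, 1.0]])))  # -> tf.Tensor(3, ...)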

MultiOutput Classification with TensorFlow Extended (TFX)

I'm quite new to TFX (TensorFlow Extended), and have been going through the sample tutorial on the TensorFlow portal to understand it a bit more so I can apply it to my dataset.
In my scenario, instead of predicting a single label, the problem at hand requires me to predict 2 outputs (category 1, category 2).
I've done this using the pure TensorFlow Keras Functional API and that works fine, but I'm now looking to see whether it can be fitted into the TFX pipeline.
Where I get the error is at the Trainer stage of the pipeline; it is thrown in the _input_fn, and I suspect it's because I'm not correctly splitting the given data into a (features, labels) tensor pair in the pipeline.
Scenario:
Each row of the input data comes in the form of
[Col1, Col2, Col3, ClassificationA, ClassificationB]
ClassificationA and ClassificationB are the categorical labels which I'm trying to predict using the Keras functional model.
The output layers of the Keras functional model look like the below, where there are 2 outputs joined to a single dense layer (note: the _xf appended to the end is just to illustrate that I've encoded the classes to int representations):
output_1 = tf.keras.layers.Dense(
    TargetA_Class, activation='sigmoid',
    name='ClassificationA_xf')(dense)
output_2 = tf.keras.layers.Dense(
    TargetB_Class, activation='sigmoid',
    name='ClassificationB_xf')(dense)

model = tf.keras.Model(inputs=inputs,
                       outputs=[output_1, output_2])
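For context, the standalone Keras version is compiled with one loss per output, keyed by the output layer names, roughly like this (the particular losses/metrics shown here are just placeholders and depend on how the labels are encoded):

model.compile(
    optimizer='adam',
    loss={'ClassificationA_xf': 'sparse_categorical_crossentropy',   # placeholder loss
          'ClassificationB_xf': 'sparse_categorical_crossentropy'},  # placeholder loss
    metrics={'ClassificationA_xf': ['accuracy'],
             'ClassificationB_xf': ['accuracy']})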
In the trainer module file, I've imported the required packages at the start of the file:
import tensorflow_transform as tft
from tfx.components.tuner.component import TunerFnResult
import tensorflow as tf
from typing import List, Text
from tfx.components.trainer.executor import TrainerFnArgs
from tfx.components.trainer.fn_args_utils import DataAccessor, FnArgs
from tfx_bsl.tfxio import dataset_options
The current _input_fn in the trainer module file looks like the below (following the tutorial):
def _input_fn(file_pattern: List[Text],
              data_accessor: DataAccessor,
              tf_transform_output: tft.TFTransformOutput,
              batch_size: int = 200) -> tf.data.Dataset:
  """Helper function that Generates features and label dataset for tuning/training.

  Args:
    file_pattern: List of paths or patterns of input tfrecord files.
    data_accessor: DataAccessor for converting input to RecordBatch.
    tf_transform_output: A TFTransformOutput.
    batch_size: representing the number of consecutive elements of returned
      dataset to combine in a single batch

  Returns:
    A dataset that contains (features, indices) tuple where features is a
    dictionary of Tensors, and indices is a single Tensor of label indices.
  """
  return data_accessor.tf_dataset_factory(
      file_pattern,
      dataset_options.TensorFlowDatasetOptions(
          batch_size=batch_size,
          #label_key=[_transformed_name(x) for x in _CATEGORICAL_LABEL_KEYS]),
          label_key=_transformed_name(_CATEGORICAL_LABEL_KEYS[0]), _transformed_name(_CATEGORICAL_LABEL_KEYS[1])),
      tf_transform_output.transformed_metadata.schema)
When I run the trainer component, the error that comes up is:
label_key=_transformed_name(_CATEGORICAL_LABEL_KEYS[0]), _transformed_name(_CATEGORICAL_LABEL_KEYS[1])),
                                                          ^
SyntaxError: positional argument follows keyword argument
I've also tried label_key=[_transformed_name(x) for x in _CATEGORICAL_LABEL_KEYS]) which also gives an error.
However, if I just pass in a single label key, label_key=_transformed_name(_CATEGORICAL_LABEL_KEYS[0]), then it works fine.
FYI - _CATEGORICAL_LABEL_KEYS is nothing but a list which contains the names of the 2 outputs I'm trying to predict (ClassificationA, ClassificationB).
_transformed_name is just a function that returns an updated name/key for the transformed data:
def _transformed_name(key):
    return key + '_xf'
Question:
From what I can see, the label_key argument of dataset_options.TensorFlowDatasetOptions can only accept a single label string/name, which means it may not be able to output a dataset with multiple labels.
Is there a way I can modify _input_fn so that the dataset it returns works with the 2 output labels? So that the tensors returned look something like:
Feature_Tensor: {Col1_xf: Col1_transformedfeature_values,
                 Col2_xf: Col2_transformedfeature_values,
                 Col3_xf: Col3_transformedfeature_values}

Label_Tensor: {ClassificationA_xf: ClassA_encodedlabels,
               ClassificationB_xf: ClassB_encodedlabels}
Would appreciate advice from the wider TFX community!
Since the label key is optional, instead of specifying it in the TensorFlowDatasetOptions you can use dataset.map afterwards and pass both labels after taking them from your dataset.
Haven't tested it but something like:
def _data_augmentation(feature_dict):
    # Split each loaded batch into a (features, labels) pair.
    features = {_transformed_name(x): feature_dict[_transformed_name(x)]
                for x in _CATEGORICAL_FEATURE_KEYS}
    labels = {_transformed_name(x): feature_dict[_transformed_name(x)]
              for x in _CATEGORICAL_LABEL_KEYS}
    return features, labels
def _input_fn(file_pattern: List[Text],
              data_accessor: DataAccessor,
              tf_transform_output: tft.TFTransformOutput,
              batch_size: int = 200) -> tf.data.Dataset:
  """Helper function that generates a features and labels dataset for tuning/training.

  Args:
    file_pattern: List of paths or patterns of input tfrecord files.
    data_accessor: DataAccessor for converting input to RecordBatch.
    tf_transform_output: A TFTransformOutput.
    batch_size: representing the number of consecutive elements of returned
      dataset to combine in a single batch

  Returns:
    A dataset that contains (features, labels) tuples where features is a
    dictionary of Tensors and labels is a dictionary with one Tensor per
    label key.
  """
  dataset = data_accessor.tf_dataset_factory(
      file_pattern,
      dataset_options.TensorFlowDatasetOptions(batch_size=batch_size),
      tf_transform_output.transformed_metadata.schema)
  dataset = dataset.map(_data_augmentation)
  return dataset
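With the labels coming back as a dict keyed by the output layer names (ClassificationA_xf, ClassificationB_xf), Keras routes each entry to the output with the matching name, so the dataset can be passed to model.fit directly. A rough, untested sketch of that call inside the usual run_fn (fn_args is the standard TFX FnArgs object, and tf_transform_output is built as in the tutorial):

train_dataset = _input_fn(fn_args.train_files, fn_args.data_accessor, tf_transform_output)
eval_dataset = _input_fn(fn_args.eval_files, fn_args.data_accessor, tf_transform_output)
model.fit(train_dataset,
          steps_per_epoch=fn_args.train_steps,
          validation_data=eval_dataset,
          validation_steps=fn_args.eval_steps)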

Finding Input and output tensors from .pb file

I created a .pb model thanks to roboflow.ai and I'm now trying to convert the .pb file into .tflite so I can use it in an Android app I'm hoping to develop. I'm struggling to do the conversion because I have to put in my 'input' and 'output' tensors.
I found a script that gave me my input tensor as 'image_tensor' but it gives me my output tensors as:
'Postprocessor/BatchMultiClassNonMaxSuppression/map/while/Switch',
'raw_detection_boxes',
'MultipleGridAnchorGenerator/assert_equal_1/Assert/Assert',
'detection_boxes',
'detection_scores',
'Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/SortByField/TopKV2',
'detection_multiclass_scores',
'Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/SortByField/Assert/Assert',
'Postprocessor/BatchMultiClassNonMaxSuppression/map/while/Switch_1',
'Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/SortByField_1/Assert/Assert',
'detection_classes',
'num_detections',
'Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/SortByField_1/TopKV2',
'Preprocessor/map/while/Switch_1',
'Preprocessor/map/while/Switch',
'raw_detection_scores'
I've tried all of these and various combinations of them, but I'm unsure which ones I should be using (or whether this is even the correct approach).
I'm trying to put this into the following code:
import tensorflow as tf
localpb = 'retrained_graph_eyes1za.pb'
tflite_file = 'retrained_graph_eyes1za.lite'
print("{} -> {}".format(localpb, tflite_file))
converter = tf.lite.TFLiteConverter.from_frozen_graph(
    localpb,
    ['input'],
    ['final_result']
)
tflite_model = converter.convert()
open(tflite_file, 'wb').write(tflite_model)
interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()
I'm using TensorFlow 1.x as this is what roboflow.ai recommends.
Any help?
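Not tested, but for an SSD-style detection graph like this, the four named detection tensors in that list are usually the ones to export; they would plug into the same converter call roughly as follows (the file name and input shape are assumptions, and the NMS post-processing ops in such graphs are often not convertible as-is, so the Object Detection API's TFLite export tooling may still be needed):

import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_frozen_graph(
    'roboflow_model.pb',  # path to the frozen graph (illustrative name)
    input_arrays=['image_tensor'],
    output_arrays=['detection_boxes', 'detection_scores',
                   'detection_classes', 'num_detections'],
    input_shapes={'image_tensor': [1, 300, 300, 3]})  # assumed input size
tflite_model = converter.convert()
open('roboflow_model.tflite', 'wb').write(tflite_model)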

Way to print invalid filenames for generator in keras

I have a big dataframe that I pass to a
generator.flow_from_dataframe(df,...)
but when I run it, I get:
UserWarning: Found 52 invalid image filename(s) in x_col="image". These filename(s) will be ignored.
.format(n_invalid, x_col)
Is there a way to print these invalid image filenames, or to find their indexes in the df?
I had a similar error and bypassed it by using the flag validate_filenames=False in the flow_from_dataframe method.
Currently, AFAIK, there is no way in Keras to list the filenames from the dataframe that do not map to the image directory.
You can write your own Python method to list the differences (for example, along the lines of the sketch below), or there may be a third-party library that does the same.
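A minimal sketch of such a check (df and train_dir are assumed to be defined as in this thread, with the filenames in an "image" column; adjust to your setup):

import os

# Mark rows whose filename does not resolve to an existing file on disk.
missing_mask = ~df['image'].apply(lambda f: os.path.isfile(os.path.join(train_dir, str(f))))
print(df.index[missing_mask].tolist())           # their indexes in the df
print(df.loc[missing_mask, 'image'].tolist())    # the invalid filenames themselves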
The issue can be fixed by specifying the absolute path for images within the dataframe.
Assuming :
The dataframe df contains two columns - Image(X) & Class(Y)
Images are stored in train_dir
(Image dataframe structure)
import os
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Specifying the absolute path for every image in the dataframe
abs_file_names = []
for file_name in df['Image']:
    tmp = os.path.abspath(train_dir + os.sep + file_name)
    abs_file_names.append(tmp)

# update dataframe
df['Image'] = abs_file_names

datagen = ImageDataGenerator(rescale=1./255., validation_split=0.25)
a = datagen.flow_from_dataframe(
    dataframe=df,
    directory=None,          # None because the filenames are now absolute paths
    x_col="Image",
    y_col="Class",
    weight_col=None,
    target_size=(150, 150),
    color_mode="rgb",
    classes=None,
    class_mode="categorical",
    batch_size=32,
    shuffle=True,
    seed=None,
    save_to_dir=None,
    save_prefix="",
    save_format=None,
    subset=None,
    interpolation="nearest",
    validate_filenames=True,
)
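To sanity-check the resulting generator, one option (untested) is to pull a single batch and inspect the shapes:

x_batch, y_batch = next(a)
print(x_batch.shape)  # e.g. (32, 150, 150, 3)
print(y_batch.shape)  # e.g. (32, number_of_classes)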
Useful resources for more info:
https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/image/ImageDataGenerator#flow_from_dataframe
Keras flowFromDirectory get file names as they are being generated
https://github.com/keras-team/keras-preprocessing/blob/6701f27afa62712b34a17d4b0ff879156b0c7937/keras_preprocessing/image/dataframe_iterator.py#L267
https://github.com/keras-team/keras-preprocessing/issues/92
df['image'] = df['image'] + '.jpg'
Use the above code to turn the image names into filenames (by appending the extension) before passing "image" to the x_col parameter of flow_from_dataframe. This is what worked for me.

LdaModel - random_state parameter not recognized - gensim

I'm using gensim's LdaModel, which, according to the documentation, has the parameter random_state. However, I'm getting an error that says:
TypeError: __init__() got an unexpected keyword argument 'random_state'
Without the random_state parameter, the function works as expected. So, the workflow looks like this for those who want to know what else is happening...
from gensim import corpora, models
import numpy as np

# pseudo code of text pre-processing, all on the "comments" variable:
# - stop words
# - remove punctuation (optional)
# - keep alpha only
# - stemming
# - get bigrams and integrate with corpus (gensim makes this very easy)
dictionary = corpora.Dictionary(comments)
corpus = [dictionary.doc2bow(comm) for comm in comments]
tfidf = models.TfidfModel(corpus)  # change weights
corp_tfidf = tfidf[corpus]         # apply them to corpus

# set random seed
random_seed = 135
state = np.random.RandomState(random_seed)

# train model
num_topics = 3
lda_mod = models.LdaModel(corp_tfidf,             # corpus
                          num_topics=num_topics,  # number of topics we want back
                          id2word=dictionary,     # our id-word map
                          passes=10,              # how many passes to take over the data
                          random_state=state)     # reproduce the results
Which results in the error message above...
TypeError: __init__() got an unexpected keyword argument 'random_state'
I'd like to be able to recreate my results, if possible.
According to this, the random_state parameter was added in the latest version (0.13.2). You can update your gensim installation with pip install gensim --upgrade. You might need to update scipy first, because it caused me problems.
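A quick way to confirm which version is installed before retraining (random_state needs gensim 0.13.2 or newer):

import gensim
print(gensim.__version__)  # should be 0.13.2 or newer for random_state to be accepted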

Resources