How to save ParallelMapDataset? - python-3.x

I have an input dataset (let's call it ds) and a function that passes each element to an encoder (a model named embedder). I want to build a dataset of encodings and save it to a file. What I tried:
Converter function:
def generate_embedding(image, label, embedder):
    return (embedder(image)[0], label)
Converting:
embedding_ds = ds.map(lambda image, label: generate_embedding(image, label, embedder), num_parallel_calls=tf.data.AUTOTUNE)
Saving:
embedding_ds.save(path)
But embedding_ds is not a tf.data.Dataset (which I expected) but a tf.raw_ops.ParallelMapDataset, which doesn't have a save method. Can anybody give some advice?
It looks like this problem is present in my TensorFlow version (2.9.2) but not in 2.11.

Maybe update? In 2.11.0, it works:
import tensorflow as tf
ds = tf.data.Dataset.range(5)
tf.__version__ # 2.11.0
ds = ds.map(lambda e : (e + 3) % 5, num_parallel_calls=3)
ds.save('test') # works
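If upgrading is not possible, a minimal sketch of a fallback is shown below. It assumes that the installed version still ships tf.data.experimental.save and tf.data.experimental.load, which older 2.x releases provided before Dataset.save and Dataset.load were added. The helper names save_dataset and load_dataset are made up for this example.

```python
import tempfile

import tensorflow as tf


def save_dataset(ds, path):
    """Save with Dataset.save when available, else the experimental API."""
    if hasattr(ds, "save"):
        ds.save(path)
    else:
        # Assumption: older 2.x releases expose this function instead.
        tf.data.experimental.save(ds, path)


def load_dataset(path):
    """Load with Dataset.load when available, else the experimental API."""
    if hasattr(tf.data.Dataset, "load"):
        return tf.data.Dataset.load(path)
    return tf.data.experimental.load(path)


ds = tf.data.Dataset.range(5).map(
    lambda e: (e + 3) % 5, num_parallel_calls=tf.data.AUTOTUNE)

path = tempfile.mkdtemp()
save_dataset(ds, path)
restored = [int(x) for x in load_dataset(path)]
```

The same two helpers should work unchanged for the embedding dataset from the question, since they only depend on which API the installed version exposes.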

Related

How to use varifocal loss in YOLOv5?

I'm a beginner at modifying YOLOv5 and I'd like to know the detailed steps for taking the varifocal loss from VarifocalNet and using it in YOLOv5 (PyTorch).
I put a link below to the Python file of the varifocal loss:
Varifocal Loss
Thank you in advance.
Copy the builder and utils files from the following repository:
https://github.com/hyz-xmaster/VarifocalNet
You will find the builder file here and the utils file here. Then paste these two files into the yolov5 "utils" folder.
pip install mmcv (you can find it here).
Paste the Varifocal code into the "loss.py" file (you can put it below the QFocalLoss class).
Remove one dot (.) before builder in the Varifocal code, i.e. write from .builder import LOSSES instead of from ..builder import LOSSES (because the builder file is now in the same folder).
Use these three lines:
g = 2  # focal loss gamma
if g > 0:
    BCEcls, BCEobj = VarifocalLoss(BCEcls), VarifocalLoss(BCEobj)
Instead of:
g = h['fl_gamma']  # focal loss gamma
if g > 0:
    BCEcls, BCEobj = FocalLoss(BCEcls, g), FocalLoss(BCEobj, g)
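For orientation, here is a minimal scalar sketch of what the varifocal loss computes, written from the formula in the VarifocalNet paper and not copied from the repository code. The function name and the defaults alpha=0.75 and gamma=2.0 are assumptions for illustration. p is the predicted score after the sigmoid and q is the IoU-aware target (0 for negatives).

```python
import math


def varifocal_loss(p, q, alpha=0.75, gamma=2.0):
    """Scalar varifocal loss for one prediction (illustrative only).

    p: predicted score in (0, 1); q: target score, 0 for a negative sample.
    """
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)  # keep log() finite
    if q > 0:
        # Positive sample: binary cross-entropy against q, weighted by q.
        return -q * (q * math.log(p) + (1.0 - q) * math.log(1.0 - p))
    # Negative sample: down-weighted by alpha * p**gamma, as in focal loss.
    return -alpha * (p ** gamma) * math.log(1.0 - p)


negative = varifocal_loss(0.5, 0.0)
good_positive = varifocal_loss(0.9, 0.9)
bad_positive = varifocal_loss(0.1, 0.9)
```

Only negatives are scaled down by the focal term, so high-quality positives keep their full weight. That is the main difference from the FocalLoss wrapper being replaced.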

MultiOutput Classification with TensorFlow Extended (TFX)

I'm quite new to TFX (TensorFlow Extended) and have been going through the sample tutorial on the TensorFlow portal to understand enough to apply it to my dataset.
In my scenario, instead of predicting a single label, the problem at hand requires me to predict 2 outputs (category 1, category 2).
I've done this using the plain TensorFlow Keras Functional API and that works fine, but I'm now looking to see whether it can be fitted into a TFX pipeline.
The error occurs at the Trainer stage of the pipeline, inside _input_fn, and I suspect it's because I'm not correctly splitting the given data into a (features, labels) tensor pair.
Scenario:
Each row of the input data comes in the form of
[Col1, Col2, Col3, ClassificationA, ClassificationB]
ClassificationA and ClassificationB are the categorical labels which I'm trying to predict using the Keras Functional model.
The output layer of the Keras functional model looks like the code below, where 2 outputs are attached to a single dense layer (note: the _xf suffix just indicates that I've encoded the classes as int representations):
output_1 = tf.keras.layers.Dense(
    TargetA_Class, activation='sigmoid',
    name='ClassificationA_xf')(dense)
output_2 = tf.keras.layers.Dense(
    TargetB_Class, activation='sigmoid',
    name='ClassificationB_xf')(dense)
model = tf.keras.Model(inputs=inputs,
                       outputs=[output_1, output_2])
At the start of the trainer module file, I've imported the required packages:
import tensorflow_transform as tft
from tfx.components.tuner.component import TunerFnResult
import tensorflow as tf
from typing import List, Text
from tfx.components.trainer.executor import TrainerFnArgs
from tfx.components.trainer.fn_args_utils import DataAccessor, FnArgs
from tfx_bsl.tfxio import dataset_options
The current input_fn in the trainer module file looks like the below (by following the tutorial)
def _input_fn(file_pattern: List[Text],
              data_accessor: DataAccessor,
              tf_transform_output: tft.TFTransformOutput,
              batch_size: int = 200) -> tf.data.Dataset:
    """Helper function that generates features and label dataset for tuning/training.

    Args:
      file_pattern: List of paths or patterns of input tfrecord files.
      data_accessor: DataAccessor for converting input to RecordBatch.
      tf_transform_output: A TFTransformOutput.
      batch_size: representing the number of consecutive elements of returned
        dataset to combine in a single batch

    Returns:
      A dataset that contains (features, indices) tuple where features is a
      dictionary of Tensors, and indices is a single Tensor of label indices.
    """
    return data_accessor.tf_dataset_factory(
        file_pattern,
        dataset_options.TensorFlowDatasetOptions(
            batch_size=batch_size,
            #label_key=[_transformed_name(x) for x in _CATEGORICAL_LABEL_KEYS]),
            label_key=_transformed_name(_CATEGORICAL_LABEL_KEYS[0]), _transformed_name(_CATEGORICAL_LABEL_KEYS[1])),
        tf_transform_output.transformed_metadata.schema)
When I run the trainer component, the error that comes up is:
label_key=_transformed_name(_CATEGORICAL_LABEL_KEYS[0]),transformed_name(_CATEGORICAL_LABEL_KEYS1)),
^ SyntaxError: positional argument follows keyword argument
I've also tried label_key=[_transformed_name(x) for x in _CATEGORICAL_LABEL_KEYS], which also gives an error.
However, if I pass in just a single label key, label_key=_transformed_name(_CATEGORICAL_LABEL_KEYS[0]), then it works fine.
FYI: _CATEGORICAL_LABEL_KEYS is simply a list containing the names of the 2 outputs I'm trying to predict (ClassificationA, ClassificationB).
_transformed_name is simply a function that returns the updated name/key for the transformed data:
def _transformed_name(key):
    return key + '_xf'
Question:
From what I can see, the label_key argument of dataset_options.TensorFlowDatasetOptions only accepts a single string (the name of one label), which means it may not be able to produce a dataset with multiple labels.
Is there a way to modify _input_fn so that the dataset it returns contains both output labels? The returned tensors would look something like:
Feature_Tensor: {Col1_xf: Col1_transformedfeature_values,
                 Col2_xf: Col2_transformedfeature_values,
                 Col3_xf: Col3_transformedfeature_values}
Label_Tensor: {ClassificationA_xf: ClassA_encodedlabels,
               ClassificationB_xf: ClassB_encodedlabels}
I would appreciate advice from the wider TFX community!
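A note on the SyntaxError itself: it comes from Python, not from TFX. A bare value after a keyword argument is parsed as a positional argument, which is not allowed. The sketch below reproduces this with a hypothetical stand-in function named options. It says nothing about which types the real label_key parameter accepts.

```python
def options(batch_size=None, label_key=None):
    """Hypothetical stand-in for a constructor that takes keyword options."""
    return {"batch_size": batch_size, "label_key": label_key}


# Same shape as the failing call: keyword argument, then a bare value.
bad_call = 'options(batch_size=200, label_key="a_xf", "b_xf")'

try:
    compile(bad_call, "<example>", "eval")
    error_message = None
except SyntaxError as exc:
    error_message = exc.msg

# Grouping both names into one object is valid syntax; whether the real
# API accepts a list for label_key is a separate question.
grouped = options(batch_size=200, label_key=["a_xf", "b_xf"])
```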
Since the label key is optional, you could leave it out of TensorFlowDatasetOptions and instead use dataset.map afterwards to pull both labels out of each element of the dataset.
I haven't tested it, but something like:
def _data_augmentation(feature_dict):
    # Split one batch of transformed columns into (features, labels).
    label_names = [_transformed_name(x) for x in _CATEGORICAL_LABEL_KEYS]
    features = {k: v for k, v in feature_dict.items() if k not in label_names}
    labels = {k: feature_dict[k] for k in label_names}
    return features, labels
def _input_fn(file_pattern: List[Text],
              data_accessor: DataAccessor,
              tf_transform_output: tft.TFTransformOutput,
              batch_size: int = 200) -> tf.data.Dataset:
    """Helper function that generates features and label dataset for tuning/training.

    Args:
      file_pattern: List of paths or patterns of input tfrecord files.
      data_accessor: DataAccessor for converting input to RecordBatch.
      tf_transform_output: A TFTransformOutput.
      batch_size: representing the number of consecutive elements of returned
        dataset to combine in a single batch

    Returns:
      A dataset of (features, labels) tuples as produced by _data_augmentation.
    """
    # No label_key here: the labels are split off by the map call below.
    dataset = data_accessor.tf_dataset_factory(
        file_pattern,
        dataset_options.TensorFlowDatasetOptions(
            batch_size=batch_size),
        tf_transform_output.transformed_metadata.schema)
    dataset = dataset.map(_data_augmentation)
    return dataset
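The splitting step can be checked on a plain dictionary before it is wired into dataset.map. The sketch below uses made-up values and a hypothetical function name, split_features_and_labels. The column names follow the question, and in the real pipeline the values would be batched tensors, not lists.

```python
_CATEGORICAL_LABEL_KEYS = ["ClassificationA", "ClassificationB"]


def _transformed_name(key):
    return key + "_xf"


def split_features_and_labels(feature_dict):
    """Return (features, labels), with labels keyed by output-layer name."""
    label_names = [_transformed_name(k) for k in _CATEGORICAL_LABEL_KEYS]
    features = {k: v for k, v in feature_dict.items() if k not in label_names}
    labels = {k: feature_dict[k] for k in label_names}
    return features, labels


# Made-up row standing in for one batch of transformed columns.
row = {
    "Col1_xf": [0.1],
    "Col2_xf": [0.2],
    "Col3_xf": [0.3],
    "ClassificationA_xf": [1],
    "ClassificationB_xf": [0],
}
features, labels = split_features_and_labels(row)
```

Keying the labels by ClassificationA_xf and ClassificationB_xf matches the name arguments of the two output layers in the question, which lets Keras pair each label with its output.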

Finding Input and output tensors from .pb file

I created a .pb model thanks to roboflow.ai and I'm now trying to convert the .pb file into .tflite so I can use it in an Android app I'm hoping to develop. I'm struggling to do the conversion because I have to put in my 'input' and 'output' tensors.
I found a script that gave me my input tensor as 'image_tensor' but it gives me my output tensors as:
'Postprocessor/BatchMultiClassNonMaxSuppression/map/while/Switch',
'raw_detection_boxes',
'MultipleGridAnchorGenerator/assert_equal_1/Assert/Assert',
'detection_boxes',
'detection_scores',
'Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/SortByField/TopKV2',
'detection_multiclass_scores',
'Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/SortByField/Assert/Assert',
'Postprocessor/BatchMultiClassNonMaxSuppression/map/while/Switch_1',
'Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/SortByField_1/Assert/Assert',
'detection_classes',
'num_detections',
'Postprocessor/BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/SortByField_1/TopKV2',
'Preprocessor/map/while/Switch_1',
'Preprocessor/map/while/Switch',
'raw_detection_scores'
I've tried all of these and different combinations of them, but I'm unsure which ones I should be using (or whether this is even the correct approach).
I'm trying to put them into the following code:
import tensorflow as tf
localpb = 'retrained_graph_eyes1za.pb'
tflite_file = 'retrained_graph_eyes1za.lite'
print("{} -> {}".format(localpb, tflite_file))
converter = tf.lite.TFLiteConverter.from_frozen_graph(
    localpb,
    ['input'],
    ['final_result']
)
tflite_model = converter.convert()
open(tflite_file, 'wb').write(tflite_model)
interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()
I'm using TensorFlow 1.x as this is what roboflow.ai recommends.
Any help?
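One way to narrow the list down, offered as a heuristic and not a confirmed answer: the scoped names (those containing "/") look like internal nodes of the post-processing loop, while the unscoped names look like the graph's named outputs. Judging by the node names, the model appears to be an export from the TensorFlow Object Detection API, where detection_boxes, detection_scores, detection_classes and num_detections are the conventional outputs. That is an assumption about this particular file. The sketch below only filters names and does not test the conversion.

```python
# A subset of the candidate names listed in the question.
candidate_nodes = [
    "Postprocessor/BatchMultiClassNonMaxSuppression/map/while/Switch",
    "raw_detection_boxes",
    "detection_boxes",
    "detection_scores",
    "detection_multiclass_scores",
    "detection_classes",
    "num_detections",
    "Preprocessor/map/while/Switch",
    "raw_detection_scores",
]

# Conventional output names (assumed to apply to this model).
CONVENTIONAL_OUTPUTS = [
    "detection_boxes",
    "detection_scores",
    "detection_classes",
    "num_detections",
]


def top_level_names(names):
    """Keep only names without a scope prefix."""
    return [n for n in names if "/" not in n]


unscoped = top_level_names(candidate_nodes)
outputs = [n for n in CONVENTIONAL_OUTPUTS if n in unscoped]
```

With these assumptions, the input array would be ['image_tensor'] and the output arrays would be the four names in outputs, replacing ['input'] and ['final_result'] in the converter call.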

How add scalar to tensor in Keras or create tensor from scalar?

I need to somehow run something like this:
x = Input(shape=(img_height, img_width, img_channels))
x1 = Add()([x, 127.5])
x2 = Multiply()([x1, -127.5])
But, error emerges:
ValueError: Layer add_1 was called with an input that isn't a symbolic tensor. Received type: <class 'float'>. Full input: [<tf.Tensor 'input_1:0' shape=(?, 400, 300, 3) dtype=float32>, 0.00784313725490196]. All inputs to the layer should be tensors.
I can't use Lambda() layer, because I need to convert final model into CoreML and I'll be unable to rewrite them in swift.
Is there any way to create Keras tensor from float?
Maybe there is a different solution for this problem?
UPD: backend is TensorFlow
Well, based on the comments above I've tested 2 approaches. A custom layer was not an option, because I would need to write it in Swift for the conversion to a CoreML model (and I do not know Swift).
Additional input
There is no way to predefine an input value, as far as I know, so I need to pass the additional parameters as inputs, which is not very convenient.
Consider the example code below:
input1 = keras.layers.Input(shape=(1,), tensor=t_b, name='11')
input2 = keras.layers.Input(shape=(1,))
input3 = keras.layers.Input(shape=(1,), tensor=t_a, name='22')
# x1 = keras.layers.Dense(4, activation='relu')(input1)
# x2 = keras.layers.Dense(4, activation='relu')(input2)
added = keras.layers.Add()([input1, input3]) # equivalent to added = keras.layers.add([x1, x2])
added2 = keras.layers.Add()([input2, added]) # equivalent to added = keras.layers.add([x1, x2])
# out = keras.layers.Dense(4)(added2)
model = keras.models.Model(inputs=[input1, input2, input3], outputs=added2)
If you load that model in a clean environment, then you will actually need to pass 3 values to it: my_model.predict([np.array([1]), np.array([1]), np.array([1])]), or an error will be raised.
CoreML tools
I was able to achieve the desired effect by using the *_bias and image_scale parameters of the converter function. Example below.
coreml_model = coremltools.converters.keras.convert(
    model_path,
    input_names='image',
    image_input_names='image',
    output_names=['cla', 'bo'],
    image_scale=1/127.5,  # divide matrix by value
    # subtract 1 from every value in matrix
    red_bias=-1.0,  # subtract value from channel
    blue_bias=-1.0,
    green_bias=-1.0
)
If somebody knows how to predefine a constant in Keras that does not have to be fed through an input layer, please write how (the tf.constant() solution does not work).
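To make the arithmetic of those converter parameters concrete, here is a small sketch. It assumes the preprocessing is applied per channel as image_scale * pixel + bias, so treat that formula as an assumption and check it against the coremltools documentation for your version. The function name preprocess is made up for this example.

```python
def preprocess(pixel, image_scale=1 / 127.5, bias=-1.0):
    """Per-channel preprocessing under the assumed scale-then-bias rule."""
    return image_scale * pixel + bias


darkest = preprocess(0.0)      # lower end of the 8-bit range
middle = preprocess(127.5)     # midpoint of the range
brightest = preprocess(255.0)  # upper end of the 8-bit range
```

Under that assumption, 8-bit pixel values in [0, 255] are mapped to roughly [-1, 1], which is what the Add and Multiply layers in the question were meant to achieve.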

How to use get_operation_by_name() in tensorflow, from a graph built from a different function?

I'd like to build a TensorFlow graph in a separate function get_graph() and print the value of a simple op a in the main function. It turns out that I can print the value of a if I return a from get_graph(). However, if I use get_operation_by_name() to retrieve a, it prints None. What did I do wrong here? Any suggestions to fix it? Thank you!
import tensorflow as tf
def get_graph():
    graph = tf.Graph()
    with graph.as_default():
        a = tf.constant(5.0, name='a')
    return graph, a
if __name__ == '__main__':
    graph, a = get_graph()
    with tf.Session(graph=graph) as sess:
        print(sess.run(a))
        a = sess.graph.get_operation_by_name('a')
        print(sess.run(a))
it prints out
5.0
None
p.s. I'm using python 3.4 and tensorflow 1.2.
Naming conventions in TensorFlow are subtle and a bit off-putting at first.
The thing is, when you write
a = tf.constant(5.0, name='a')
a is not the constant op but its output tensor. The name of an op output is the op name followed by a colon and the index of that output. Here, the constant op has only one output, so its name is
print(a.name)
# a:0
When you run sess.graph.get_operation_by_name('a') you do get the constant op, and calling sess.run on an operation executes it but returns None. What you actually wanted is 'a:0', the tensor that is the output of this operation, whose evaluation returns the value.
a = sess.graph.get_tensor_by_name('a:0')
print(sess.run(a))
# 5.0
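The naming rule can be written down as a tiny helper. This is an illustrative sketch with a made-up function name, split_tensor_name, and is not part of the TensorFlow API. It only parses strings and does not touch a graph.

```python
def split_tensor_name(name):
    """Split 'op_name:index' into (op_name, index).

    A name without ':' is treated as an operation name and returns
    (name, None).
    """
    op_name, sep, index = name.rpartition(":")
    if not sep:
        return name, None
    return op_name, int(index)


tensor_parts = split_tensor_name("a:0")
op_parts = split_tensor_name("a")
scoped_parts = split_tensor_name("scope/TopKV2:1")
```

A string of the first form is what get_tensor_by_name expects, while get_operation_by_name expects the bare op name.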
