OCR code written without custom loss function - keras

I am working on an OCR model. My final goal is to convert the OCR code to Core ML and deploy it on iOS.
I have looked at and run a couple of GitHub source codes, namely:
here
here
As you can see from them, they both implement the loss as a custom layer via a Lambda layer.
The problem starts when I want to convert this to Core ML.
This is my conversion code:
import coremltools

def convert_lambda(layer):
    # Only convert this Lambda layer if it is for our CTC loss function.
    if layer.function == ctc_lambda_func:
        params = NeuralNetwork_pb2.CustomLayerParams()
        # The name of the Swift or Obj-C class that implements this layer.
        params.className = "x"
        # The description is shown in Xcode's mlmodel viewer.
        params.description = "A fancy new loss"
        return params
    else:
        return None
print("\nConverting the model:")
# Convert the model to Core ML.
coreml_model = coremltools.converters.keras.convert(
model,
# 'weightswithoutstnlrchangedbackend.best.hdf5',
input_names="image",
image_input_names="image",
output_names="output",
add_custom_layers=True,
custom_conversion_functions={"Lambda": convert_lambda},
)
but it raises this error:
Converting the model:
Traceback (most recent call last):
File "/home/sgnbx/Downloads/projects/CRNN-with-STN-master/CRNN_with_STN.py", line 201, in <module>
custom_conversion_functions={"Lambda": convert_lambda},
File "/home/sgnbx/anaconda3/envs/tf_gpu/lib/python3.6/site-packages/coremltools/converters/keras/_keras_converter.py", line 760, in convert
custom_conversion_functions=custom_conversion_functions)
File "/home/sgnbx/anaconda3/envs/tf_gpu/lib/python3.6/site-packages/coremltools/converters/keras/_keras_converter.py", line 556, in convertToSpec
custom_objects=custom_objects)
File "/home/sgnbx/anaconda3/envs/tf_gpu/lib/python3.6/site-packages/coremltools/converters/keras/_keras2_converter.py", line 255, in _convert
if input_names[idx] in input_name_shape_dict:
IndexError: list index out of range
Input name length mismatch
I am not sure I can resolve this, as I did not find anything relevant to this error.
On the other hand, most OCR code uses a custom loss function, so I would probably run into the same problem again.
So in the end I have two questions:
Do you know how to resolve this error?
My main question: do you know of any OCR source code in Keras (as I have to convert it to Core ML) that does not have a custom loss function, so it will convert to Core ML without problems?
Thanks in advance :)
Just to make my question thorough, this is the custom loss function in the source I am working with:
def ctc_lambda_func(args):
    iy_pred, ilabels, iinput_length, ilabel_length = args
    # the 2 is critical here since the first couple outputs of the RNN
    # tend to be garbage:
    iy_pred = iy_pred[:, 2:, :]  # no such influence
    return backend.ctc_batch_cost(ilabels, iy_pred, iinput_length, ilabel_length)

loss_out = Lambda(ctc_lambda_func, output_shape=(1,), name='ctc')(
    [fc_2, labels, input_length, label_length])
and then use it in compile:
model.compile(loss={'ctc': lambda y_true, y_pred: y_pred}, optimizer=sgd)

Core ML doesn't let you train a model, so it doesn't matter whether there is a loss function or not. If you only want to use the CRNN as a predictor on iOS, you should just convert the base_model from the second link.
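For illustration, a minimal sketch of that conversion, assuming the repository exposes a prediction-only model (called base_model here, which is an assumption; use whatever name the repo gives its inference model) that takes only the image and ends at the softmax output, so the CTC Lambda layer and the extra label/length inputs never enter the converted graph:

import coremltools

# base_model: image input -> ... -> per-timestep softmax over characters (no CTC Lambda).
coreml_model = coremltools.converters.keras.convert(
    base_model,
    input_names="image",
    image_input_names="image",
    output_names="output",
)
coreml_model.save("ocr.mlmodel")

The CTC decoding (e.g. best-path decoding of the softmax output) then has to be done in your iOS code after prediction.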

Related

caffe2 inference with an ONNX model raises IndexError: Input 475 is undefined

I get this error when I use caffe2 with a pretrained model. The model is from https://github.com/onnx/models/blob/master/vision/classification/vgg/model/vgg16-7.onnx.
I use the pretrained model unchanged, and I do not use https://github.com/onnx/optimizer.
The code is:

import onnx
import caffe2.python.onnx.backend

model = onnx.load("vgg16-7.onnx")
prepared_backend = caffe2.python.onnx.backend.prepare(model)

Then an error happens:
WARNING:root:This caffe2 python run failed to load cuda module:No module named 'caffe2.python.caffe2_pybind11_state_gpu',and AMD hip module:No module named 'caffe2.python.caffe2_pybind11_state_hip'.Will run in CPU only mode.
WARNING: ONNX Optimizer has been moved to https://github.com/onnx/optimizer.
All further enhancements and fixes to optimizers will be done in this new repo.
The optimizer code in onnx/onnx repo will be removed in 1.9 release.
Traceback (most recent call last):
File "test.py", line 20, in
init_net, predict_net = c2.onnx_graph_to_caffe2_net(onnx_model_proto)
File "/home/eeodev/.local/lib/python3.6/site-packages/caffe2/python/onnx/backend.py", line 921, in onnx_graph_to_caffe2_net
return cls._onnx_model_to_caffe2_net(model, device=device, opset_version=opset_version, include_initializers=True)
File "/home/eeodev/.local/lib/python3.6/site-packages/caffe2/python/onnx/backend.py", line 876, in _onnx_model_to_caffe2_net
onnx_model = onnx.utils.polish_model(onnx_model)
File "/usr/local/lib64/python3.6/site-packages/onnx/utils.py", line 24, in polish_model
model = onnx.optimizer.optimize(model)
File "/usr/local/lib64/python3.6/site-packages/onnx/optimizer.py", line 55, in optimize
optimized_model_str = C.optimize(model_str, passes)
IndexError: Input 475 is undefined!
Who can tell me the solution?
Also, if it were a PyTorch model, when converting to ONNX we could use torch.onnx.export(model, input, 'model.onnx', verbose=True, keep_initializers_as_inputs=True); with keep_initializers_as_inputs=True, loading the model with caffe2 does not hit this error. But the model I use is already a pretrained ONNX model, so how can I apply this method?
I believe it is related to the IR gap issue: https://github.com/onnx/onnx/issues/2902.
Currently the deprecated ONNX optimizer in the ONNX repo cannot handle ONNX models with IR_VERSION >= 4 if the initializers are not included in the model's inputs.
The workaround is to use the following script to add the initializer information back into the model's value info (contributed by @TMVector on GitHub):
def add_value_info_for_constants(model: onnx.ModelProto):
    """
    Currently onnx.shape_inference doesn't use the shape of initializers, so add
    that info explicitly as ValueInfoProtos.
    Mutates the model.
    Args:
        model: The ModelProto to update.
    """
    # All (top-level) constants will have ValueInfos before IRv4 as they are all inputs
    if model.ir_version < 4:
        return

    def add_const_value_infos_to_graph(graph: onnx.GraphProto):
        inputs = {i.name for i in graph.input}
        existing_info = {vi.name: vi for vi in graph.value_info}
        for init in graph.initializer:
            # Check it really is a constant, not an input
            if init.name in inputs:
                continue

            # The details we want to add
            elem_type = init.data_type
            shape = init.dims

            # Get existing or create new value info for this constant
            vi = existing_info.get(init.name)
            if vi is None:
                vi = graph.value_info.add()
                vi.name = init.name

            # Even though it would be weird, we will not overwrite info even if it doesn't match
            tt = vi.type.tensor_type
            if tt.elem_type == onnx.TensorProto.UNDEFINED:
                tt.elem_type = elem_type
            if not tt.HasField("shape"):
                # Ensure we set an empty list if the const is scalar (zero dims)
                tt.shape.dim.extend([])
                for dim in shape:
                    tt.shape.dim.add().dim_value = dim

        # Handle subgraphs
        for node in graph.node:
            for attr in node.attribute:
                # Ref attrs refer to other attrs, so we don't need to do anything
                if attr.ref_attr_name != "":
                    continue

                if attr.type == onnx.AttributeProto.GRAPH:
                    add_const_value_infos_to_graph(attr.g)
                if attr.type == onnx.AttributeProto.GRAPHS:
                    for g in attr.graphs:
                        add_const_value_infos_to_graph(g)

    return add_const_value_infos_to_graph(model.graph)
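A minimal usage sketch of that workaround (the file name is just an example): load the ONNX model, patch it with add_value_info_for_constants, and only then hand it to the caffe2 backend so the deprecated optimizer no longer trips over undefined inputs:

import onnx
import caffe2.python.onnx.backend as backend

model = onnx.load("vgg16-7.onnx")
add_value_info_for_constants(model)  # patch the model in place
prepared_backend = backend.prepare(model)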

Is it possible to replace keras Lambda layers with native CoreML ones to convert the model?

For the first time I faced the need to convert my Keras model to Core ML. This can be done via the coremltools package:
import coremltools
import keras

model = Model(...)  # keras

coreml_model = coremltools.converters.keras.convert(
    model,
    input_names="input_image_NHWC",
    output_names="output_image_NHWC",
    image_scale=1.0,
    model_precision='float32',
    use_float_arraytype=True,
    custom_conversion_functions={"Lambda": convert_lambda},
    input_name_shape_dict={'input_image_NHWC': [None, 384, 384, 3]}
)
However, I have two Lambda layers, where the first one is depth-to-space (pixel shuffle) and the other one is a scaler:
def tf_upsampler(x):
    return tf.nn.depth_to_space(x, 4)

def mulfunc(x, beta=0.2):
    return beta * x

...
x = Lambda(tf_upsampler)(x)
...
x = Lambda(mulfunc)(x)
The only advice I found, as far as I understand, is to use custom layers, which means implementing my layers in Swift code later. Something like this, with MyPixelShuffle and MyScaleLayer to be implemented somehow as classes in the Xcode project (?):
def convert_lambda(layer):
    # Only convert the Lambda layers we know about.
    if layer.function == tf_upsampler:
        params = NeuralNetwork_pb2.CustomLayerParams()
        # The name of the Swift or Obj-C class that implements this layer.
        params.className = "MyPixelShuffle"
        # The description is shown in Xcode's mlmodel viewer.
        params.description = "pixelshuffle"
        params.parameters["blockSize"].intValue = 4
        return params
    elif layer.function == mulfunc:
        # https://stackoverflow.com/questions/47987777/custom-layer-with-two-parameters-function-on-core-ml
        params = NeuralNetwork_pb2.CustomLayerParams()
        # The name of the Swift or Obj-C class that implements this layer.
        params.className = "MyScaleLayer"
        params.description = "scaling input"
        # HERE!! This is important.
        params.parameters["scale"].doubleValue = 0.2
        # The description is shown in Xcode's mlmodel viewer.
        params.description = "multiplication by constant"
        return params
However, I found that Core ML actually has the layers I need; they are available as ScaleLayer and ReorganizeDataLayer.
How can I use those native layers to replace the Lambdas in the Keras model? Is it possible to edit the Core ML protobuf for the network? Or, if there are Swift/Obj-C classes for them, what are they called?
Can it be done by deleting/adding layers with coremltools.models.neural_network.NeuralNetworkBuilder?
UPDATE:
I found that the Keras converter actually invokes the neural network builder to add the different layers. The builder has the method builder.add_reorganize_data I need. Now the question is how to replace the custom layers in the model. I can load the model into a builder and inspect its layers:
coreml_model_path = 'mymodel.mlmodel'
spec = coremltools.models.utils.load_spec(coreml_model_path)
builder = coremltools.models.neural_network.NeuralNetworkBuilder(spec=spec)
builder.inspect_layers(last=10)
[Id: 417], Name: lambda_10 (Type: custom)
Updatable: False
Input blobs: ['up1_output']
Output blobs: ['lambda_10_output']
It's much simpler to do something like this:
def convert_lambda(layer):
    if layer.function == tf_upsampler:
        params = NeuralNetwork_pb2.ReorganizeDataLayerParams()
        params.fillInTheOtherPropertiesHere = someValue
        return params
    ...etc..
In other words, you don't have to return a custom layer if some existing layer type already does what you want.
Okay, it seems I have found a way to go. I created a virtual environment with a separate copy of coremltools and edited the _convert() method in _keras2_converter.py by adding the following code:
for iter, layer in enumerate(graph.layer_list):
    keras_layer = graph.keras_layer_map[layer]
    print("%d : %s, %s" % (iter, layer, keras_layer))
    if isinstance(keras_layer, _keras.layers.wrappers.TimeDistributed):
        keras_layer = keras_layer.layer
    converter_func = _get_layer_converter_fn(keras_layer, add_custom_layers)
    input_names, output_names = graph.get_layer_blobs(layer)
    # this may be none if we're using custom layers
    if converter_func:
        converter_func(builder, layer, input_names, output_names,
                       keras_layer, respect_trainable)
    else:
        if _is_activation_layer(keras_layer):
            import six
            if six.PY2:
                layer_name = keras_layer.activation.func_name
            else:
                layer_name = keras_layer.activation.__name__
        else:
            layer_name = type(keras_layer).__name__
        if layer_name in custom_conversion_functions:
            custom_spec = custom_conversion_functions[layer_name](keras_layer)
        else:
            custom_spec = None

        if layer.find('tf_up') != -1:
            print('TF_UPSCALE found')
            builder.add_reorganize_data(layer, input_names[0], output_names[0],
                                        mode='DEPTH_TO_SPACE', block_size=4)
        elif layer.find('mulfunc') != -1:
            print('SCALE found')
            builder.add_scale(layer, W=0.2, b=0, has_bias=False,
                              input_name=input_names[0], output_name=output_names[0])
        else:
            builder.add_custom(layer, input_names, output_names, custom_spec)
The trigger is the layer name. In Keras I use model.load_weights(by_name=True) and mark my Lambdas like this:
x = Lambda(mulfunc, name=scope+'mulfunc')(x)
x = Lambda(tf_upsampler,name='tf_up')(x)
Now the model at least has layers I need:
[Id: 417], Name: tf_up (Type: reorganizeData)
Updatable: False
Input blobs: ['up1_output']
Output blobs: ['tf_up_output']
Now it's time to validate on my VirtualBox macOS; I will post what I get.
UPDATE:
I don't see errors regarding my replaced Lambda layers, but there is another one that does not let me predict:
Layer 'concatenate_1' type 320 has 1 inputs but expects at least 2
I think it's related to the fact that in Keras the concatenate layer receives a single input which is a list of inputs. I'm looking into how to fix this.
UPDATE 2:
I tried to hotfix this by using the builder's add_concat_nd(self, name, input_names, output_name, axis) function for my concatenate layers, but got an error during inference that this layer type is unsupported (?!):
if converter_func:
    if layer.find('concatenate') != -1:
        print('CONCATENATE FOUND')
        builder.add_concat_nd(layer, input_names, output_names[0], axis=3)
    else:
        converter_func(builder, layer, input_names, output_names,
                       keras_layer, respect_trainable)
Unsupported layer type (CoreML.Specification.NeuralNetworkLayer)
UPDATE 4:
I found a fix for this and changed the way the builder is initialized:

builder = _NeuralNetworkBuilder(input_features, output_features, mode=mode,
                                use_float_arraytype=use_float_arraytype,
                                disable_rank5_shape_mapping=True)

Now the error message is gone, but I have problems with the Xcode version: the model has specification version 4, while my Xcode supports 3. I will try to update my VM.
"CoreML survival guide" pdf suggests in this case:
pip install -U git+https://github.com/apple/coremltools.git
For example, if loading a model using coremltools gives an error such as the following, then
try installing the latest coremltools directly from the GitHub repo.
Error compiling model: "Error reading protobuf spec. validator error:
The .mlmodel supplied is of version 3, intended for a newer version of
Xcode. This version of Xcode supports model version 2 or earlier.
UPDATE:
Updated from git. Now I have this error:
No module named 'coremltools.libcoremlpython'
It looks like the latest git version is broken :(
Damn, it seems I need macOS 10.15 and Xcode 11.
UPDATE 5:
Still fighting with a bug on 10.15. I found that coremltools somehow deduplicates the inputs to a Concatenate layer, so if your Keras code has something like Concatenate()([x, x]) you end up with a concatenate layer with a single input in Core ML, and an error. To fix it I further modified the code above:
if layer.find('concatenate') != -1:
    print('CONCATENATE FOUND', len(input_names))
    if len(input_names) == 1:
        input_names = [input_names[0], input_names[0]]
    builder.add_concat_nd(layer, input_names, output_names[0], axis=3)
Then I faced this error: "input layer 'conv_in' of type 'Convolution' has input rank 3 but expects rank at least 4" (see: coremltools: how to properly use NeuralNetworkMultiArrayShapeRange?), which seems to be caused by Core ML making the input 3-dimensional CHW, while it must be 4-dimensional NHWC (?). Currently I am playing with the following:
spec = coreml_model._spec
# fixing input shape
# https://stackoverflow.com/questions/59765376/coreml-model-convert-imagetype-model-input-to-multiarray
spec.description.input[0].type.multiArrayType.shape.extend([1, 384, 384, 3])
del spec.description.input[0].type.multiArrayType.shape[0]
del spec.description.input[0].type.multiArrayType.shape[0]
del spec.description.input[0].type.multiArrayType.shape[0]
coremltools.utils.save_spec(spec, "my.mlmodel")
Now I am getting "invalid blob shape" from the model internals.

librosa.util.exceptions.ParameterError: Invalid shape for monophonic audio: ndim=2, shape=(172972, 2)

Please, can somebody help me solve this?
I was following this tutorial:
https://data-flair.training/blogs/python-mini-project-speech-emotion-recognition/
and used their dataset, which they took from the RAVDESS dataset and lowered the sample rate of. I can train with that data easily. But when I use the original data from here:
https://zenodo.org/record/1188976
(just "Audio_Speech_Actors_01-24.zip") and try to train the model, it gives me the error below:
Traceback (most recent call last):
File "C:/Users/raj.pandey/Desktop/speech-emotion-recognition/main.py", line 64, in <module>
x_train, x_test, y_train, y_test = load_data(test_size=0.20)
File "C:/Users/raj.pandey/Desktop/speech-emotion-recognition/main.py", line 57, in load_data
feature = extract_feature(file, mfcc=True, chroma=True, mel=True)
File "C:/Users/raj.pandey/Desktop/speech-emotion-recognition/main.py", line 32, in extract_feature
stft = np.abs(librosa.stft(X))
File "C:\Users\raj.pandey\Desktop\speech-emotion-recognition\lib\site-packages\librosa\core\spectrum.py", line 215, in stft
util.valid_audio(y)
File "C:\Users\raj.pandey\Desktop\speech-emotion-recognition\lib\site-packages\librosa\util\utils.py", line 268, in valid_audio
'ndim={:d}, shape={}'.format(y.ndim, y.shape))
librosa.util.exceptions.ParameterError: Invalid shape for monophonic audio: ndim=2, shape=(172972, 2)
The tutorial's dataset is the same dataset, just with a lowered sample rate. Why isn't it running on the original one?
Does it have anything to do with this line in the code?
X = sound_file.read(dtype="float32")
Out of curiosity I also tried to predict from an .mp3 file, and it resulted in an error. Then I converted that .mp3 file to .wav and tried again, but it still gives the error in the title.
How do I solve this error and make it train on the original data? If it starts training on the original data, then I think it will also be able to predict on the .mp3-to-.wav converted file.
Below is the code that I am using:
import librosa
import soundfile
import os
import glob
import pickle
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

# DataFlair - Emotions in the RAVDESS dataset
emotions = {
    '01': 'neutral',
    '02': 'calm',
    '03': 'happy',
    '04': 'sad',
    '05': 'angry',
    '06': 'fearful',
    '07': 'disgust',
    '08': 'surprised'
}

# DataFlair - Emotions to observe
observed_emotions = ['calm', 'happy', 'fearful', 'disgust']

# DataFlair - Extract features (mfcc, chroma, mel) from a sound file
def extract_feature(file_name, mfcc, chroma, mel):
    with soundfile.SoundFile(file_name) as sound_file:
        X = sound_file.read(dtype="float32")
        sample_rate = sound_file.samplerate
        if chroma:
            stft = np.abs(librosa.stft(X))
        result = np.array([])
        if mfcc:
            mfccs = np.mean(librosa.feature.mfcc(y=X, sr=sample_rate, n_mfcc=40).T, axis=0)
            result = np.hstack((result, mfccs))
        if chroma:
            chroma = np.mean(librosa.feature.chroma_stft(S=stft, sr=sample_rate).T, axis=0)
            result = np.hstack((result, chroma))
        if mel:
            mel = np.mean(librosa.feature.melspectrogram(X, sr=sample_rate).T, axis=0)
            result = np.hstack((result, mel))
        return result

# DataFlair - Load the data and extract features for each sound file
def load_data(test_size=0.2):
    x, y = [], []
    # for file in glob.glob("C:\\Users\\raj.pandey\\Desktop\\speech-emotion-recognition\\Dataset\\newactor\\*.wav"):
    for file in glob.glob("C:\\Users\\raj.pandey\\Desktop\\speech-emotion-recognition\\Dataset\\Actor_*\\*.wav"):
        file_name = os.path.basename(file)
        emotion = emotions[file_name.split("-")[2]]
        if emotion not in observed_emotions:
            continue
        feature = extract_feature(file, mfcc=True, chroma=True, mel=True)
        x.append(feature)
        y.append(emotion)
    return train_test_split(np.array(x), y, test_size=test_size, random_state=9)

# DataFlair - Split the dataset
x_train, x_test, y_train, y_test = load_data(test_size=0.20)

# DataFlair - Get the shape of the training and testing datasets
# print((x_train.shape[0], x_test.shape[0]))

# DataFlair - Get the number of features extracted
# print(f'Features extracted: {x_train.shape[1]}')

# DataFlair - Initialize the Multi Layer Perceptron Classifier
model = MLPClassifier(alpha=0.01, batch_size=256, epsilon=1e-08, hidden_layer_sizes=(300,),
                      learning_rate='adaptive', max_iter=500)

# DataFlair - Train the model
model.fit(x_train, y_train)
# print(model.fit(x_train, y_train))

# DataFlair - Predict for the test set
y_pred = model.predict(x_test)
# print("This is y_pred: ", y_pred)

# DataFlair - Calculate the accuracy of our model
accuracy = accuracy_score(y_true=y_test, y_pred=y_pred)

# DataFlair - Print the accuracy
# print("Accuracy: {:.2f}%".format(accuracy * 100))

# Predicting random files
tar_file = "C:\\Users\\raj.pandey\\Desktop\\speech-emotion-recognition\\Dataset\\newactor\\pls-hold-while-try.wav"
new_feature = extract_feature(tar_file, mfcc=True, chroma=True, mel=True)
data = []
data.append(new_feature)
data = np.array(data)
z_pred = model.predict(data)
print("This is output: ", z_pred)
The dataset provided by the tutorial for training was this: https://drive.google.com/file/d/1wWsrN2Ep7x6lWqOXfr4rpKGYrJhWc8z7/view
The original dataset (which isn't working with the program) can be found here: https://zenodo.org/record/1188976 (the Audio_Speech_Actors one).
When predicting random files, putting in any .wav file with speech in it results in an error. And if you use a text-to-speech converter, get the .wav, and pass it here, it always says "fearful". I have tried converting an .mp3 to .wav to get it to work, but no, still an error.
Has anyone figured out how I can get this working?
I've just run into the same problem. For anyone reading this who prefers not to delete the stereo files, it is possible to convert them to mono using the command line tool ffmpeg:
ffmpeg -i stereo_file_name.wav -ac 1 mono_file_name.wav
Link to ffmpeg
Related Stack Overflow Post
Alternatively, you can convert each file to mono with pydub inside load_data() before extracting features:

from pydub import AudioSegment

def load_data(test_size=0.2):
    x, y = [], []
    for file in glob.glob("C:\\Users\\raj.pandey\\Desktop\\speech-emotion-recognition\\Dataset\\Actor_*\\*.wav"):
        file_name = os.path.basename(file)
        # converting stereo audio to mono
        sound = AudioSegment.from_wav(file)
        sound = sound.set_channels(1)
        sound.export(file, format="wav")
        emotion = emotions[file_name.split("-")[2]]
        if emotion not in observed_emotions:
            continue
        feature = extract_feature(file, mfcc=True, chroma=True, mel=True)
        x.append(feature)
        y.append(emotion)
    return train_test_split(np.array(x), y, test_size=test_size, random_state=9)
This is because the audio file has 2 audio channels; it must have only one audio channel.
I'm working on the same dataset and got the same error as well. What I did was convert the audio to mono using an online converter https://convertio.co/ by following this website's directions https://videoconvert.minitool.com/video-converter/stereo-to-mono.html (point 1, convertio)
array(['fearful'], dtype='<U7')
The above line is my output too; it predicts it as fearful, which might be because of the accuracy (mine is 73.96%, but it varies).
The tutorial's dataset is the same dataset, just with a lowered sample rate. Why isn't it running on the original one?
Even though people have already answered this question, the author or authors of that tutorial didn't mention that the dataset posted on their Google Drive has all audio tracks in mono, while in the original dataset some audio tracks are in stereo.
As Arryan Sinha already showed, just use the pydub package and the job is done.
Other than this, I would suggest not paying too much attention to that tutorial, because the classifier's results are, most of the time, around 50% accuracy, which is not great. To verify whether the classifier is actually good, try printing a confusion matrix; that surely helps to see whether a classifier is good or not.
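For illustration, a minimal sketch using scikit-learn with the y_test and y_pred variables from the question's script (the label list matches the observed_emotions defined there):

from sklearn.metrics import confusion_matrix

labels = ['calm', 'happy', 'fearful', 'disgust']
# Rows are the true emotions, columns are the predicted emotions.
cm = confusion_matrix(y_test, y_pred, labels=labels)
print(labels)
print(cm)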
Some of the audio files are stereo (and the code needs mono); those are causing the break. Removing those files from the dataset eliminates this error. In load_data(), add a line print(file). It will tell you which files are breaking the code; then just remove them.
def load_data(test_size=0.2):
    x, y = [], []
    for file in glob.glob("\Actor_*\\*.wav"):
        file_name = os.path.basename(file)
        print(file)
        emotion = emotions[file_name.split("-")[2]]
        ...
I found 4 files that are causing this :
Actor_01/03-01-02-01-01-02-01.wav
Actor_05/03-01-02-01-02-02-05.wav
Actor_20/03-01-03-01-02-01-20.wav
Actor_20/03-01-06-01-01-02-20.wav
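If you would rather find the stereo files programmatically than let the feature extraction crash on them, a small sketch using soundfile (the dataset path is a placeholder) could look like this:

import glob
import soundfile

# List every .wav in the dataset that has more than one channel.
for file in glob.glob("Dataset/Actor_*/*.wav"):
    info = soundfile.info(file)
    if info.channels > 1:
        print(file, "has", info.channels, "channels")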

error in converting tensor to numpy array

I'm trying to convert input_image, which is a tensor, to a numpy array. Following the already answered questions here, and several others that suggested using input_image.eval() or equivalently sess.run() for this conversion, I did the same, but it throws an error and apparently expects a feed_dict value for sess.run(). But since I'm not trying to run an operation that depends on unknown values, I don't see the need for a feed_dict here, because all I'm doing is a conversion.
Besides, just to check, I also tried converting a tf.constant([1,2,3]) value right above it using the same method, and that ran successfully even though its data type is the same as input_image. Here's my code, which is part of a larger script:
def call(self, x):
    input_image = Input(shape=(None, None, 3))
    print(input_image.shape)
    print(type(tf.constant([1,2,3])))
    print(type(input_image))
    print(type(K.get_session().run(tf.constant([1,2,3]))))
    print(type(K.get_session().run(input_image)))
and here's the error:
(?, ?, ?, 3)
<class 'tensorflow.python.framework.ops.Tensor'>
<class 'tensorflow.python.framework.ops.Tensor'>
<class 'numpy.ndarray'>
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1365, in _do_call
return fn(*args)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1350, in _run_fn
target_list, run_metadata)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
(0) Invalid argument: You must feed a value for placeholder tensor 'input_1' with dtype float and shape [?,?,?,3]
[[{{node input_1}}]]
[[input_1/_1051]]
(1) Invalid argument: You must feed a value for placeholder tensor 'input_1' with dtype float and shape [?,?,?,3]
[[{{node input_1}}]]
0 successful operations.
0 derived errors ignored.
I wonder why the former works and the latter doesn't.
There is no such thing as "converting" a symbolic tensor to a numpy array, as the latter cannot hold the same kind of information as the former.
When you use eval() or session.run(), what you are doing is evaluating a symbolic expression to get a numerical result, which is a numpy array, but this is not a conversion. Evaluating an expression might or might not require additional input data (that's what the feed_dict is for), depending on the expression.
Evaluating a constant (tf.constant) does not require any input data, but evaluating your other expression does require the input data, so you cannot "convert" this to a numpy array.
Just adding to (or elaborating on) what @MatiasValdenegro said:
TensorFlow follows something called graph execution (or define-then-run). In other words, when you write a TensorFlow program it defines something called a data-flow graph which shows how the operations you defined are related to each other. And then you execute bits and pieces of that graph depending on the results you're after.
Let's consider two examples. (I am switching to a simple TensorFlow program instead of Keras bits as it makes things clearer. After all, K.get_session() returns a Session object.)
Example 1
Say you have the following program.
import numpy as np
import tensorflow as tf

a = tf.placeholder(shape=[2,2], dtype=tf.float32)
b = tf.constant(1, dtype=tf.float32)
c = a * b

# Wrong: This is what you're doing essentially when you do sess.run(input_image)
with tf.Session() as sess:
    print(sess.run(c))

# Right: You need to feed values that c is dependent on
with tf.Session() as sess:
    print(sess.run(c, feed_dict={a: np.array([[1,2],[2,3]])}))
Whenever a resulting tensor (e.g. c) is dependent on a placeholder you cannot execute it and get the result without feeding values to all the dependent placeholders.
Example 2
When you define a tf.constant(1) this is not dependent on anything. In other words you don't need a feed_dict and can directly run eval() or sess.run() on it.
Update: Further explanation on why you need a feed_dict for input_image
TLDR: You need a feed_dict because your resulting Tensor is produced by an Input layer.
Your input_image is basically the resulting tensor you get by feeding something to the Input layer. Usually in Keras, you are not exposed to the internal placeholder level details. But you would do that via using model.fit() or model.evaluate(). You can see that Keras Input layer in fact uses a placeholder by analysing this line.
Hope I made my point clear that you do need to feed in a value to the placeholder to successfully evaluate the output of an Input layer. Because that basically holds a placeholder.
Update 2: How to feed to your Input layer
So, it appears you can use a feed_dict with the Keras Input layer in the following manner. Instead of defining the shape argument, you pass a placeholder straight to the tensor argument, which bypasses the internal placeholder creation in the layer.
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Input
import tensorflow.keras.backend as K

x = tf.placeholder(shape=[None, None, None, 3], dtype=tf.float32)
input_image = Input(tensor=x)

arr = np.array([[[[1,1,1]]]])
print(arr.shape)
print(K.get_session().run(input_image, feed_dict={x: arr}))

How to train your model in deeppavlov (NER) Python 3

First of all, sorry for any newbie mistakes I've made, but I couldn't figure this out and couldn't find a source specifically for the deeppavlov (NER) library. I'm trying to train ner_ontonotes_bert_mult as described here. I guess it can be trained from its checkpoint to make it recognize some specific patterns like:
"Round 23/22; 24,9 x 12,2 x 12,3"
as
[[['Round', '23/22', ';', '24,9 x 12,2 x 12,3']], [['B-PRODUCT', 'I-PRODUCT', 'B-QUANTITY']]]
My questions are (before I dig into details):
Is it possible? And I realized I can't use samples like " Round 23/22; 24,9 x 12,2 x 12,3 ". I need them to be in full sentences.
Where can I find more info about it specifically related to deeppavlov's model(s)?
How can I train pre-trained deeppavlov model to recognize my custom patterns?
I don't even know whether it is possible, but I've decided to give it a go. I prepared 3 .txt files, "train.txt", "test.txt" and "validation.txt", as described on the DeepPavlov web page, and put them under the folder '~/.deeppavlov/downloads/ontonotes/ner_ontonotes_bert_mult'. My dataset looks like this:
Round B-PRODUCT
23/22 I-PRODUCT
24,9 x 12,2 x 12,3 B-QUANTITY
Ring B-PRODUCT
HDFAA I-PRODUCT
12,7 x 10 B-QUANTITY
and so on... This is the code I am using to train it:
import os
# Force tensorflow to use CPU instead of GPU.
os.environ['CUDA_VISIBLE_DEVICES'] = '-1'
from deeppavlov import configs, train_model
from deeppavlov.core.commands.utils import parse_config
config_dict = parse_config(configs.ner.ner_ontonotes_bert_mult)
print(config_dict['dataset_reader']['data_path'])
from deeppavlov import configs, train_model
ner_model = train_model(configs.ner.ner_ontonotes_bert_mult)
But I am getting this error:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [3] rhs shape= [37]
[[{{node save/Assign_280}}]]
Full traceback:
2019-09-26 15:50:27.63 ERROR in 'deeppavlov.core.common.params'['params'] at line 110: Exception in <class 'deeppavlov.models.bert.bert_ner.BertNerModel'>
Traceback (most recent call last):
File "/home/custom_user/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1356, in _do_call
return fn(*args)
File "/home/custom_user/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1341, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/home/custom_user/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1429, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [3] rhs shape= [37]
[[{{node save/Assign_280}}]]
UPDATE 2:
And I realized I can't use samples like " Round 23/22; 24,9 x 12,2 x 12,3 ". I need them to be in full sentences.
UPDATE:
It seems this is happening due to my dataset. My custom dataset only has 3 tags (B-PRODUCT, I-PRODUCT and B-QUANTITY), but the pre-trained model has 37 of them. All available tags can be found here under the sentence "The list of available tags and their descriptions are presented below." (18 main tags, 36 counting the B and I variants, plus the O tag; "O" means the absence of an entity). All 37 tags need to be present in the dataset. I was able to get past that error by adding dummy sentences tagged with the missing tags. This is a terrible workaround since I'm willingly disrupting my own dataset. I'm still looking for a 'logical' way to train...
PS: Now I am getting this error.
Traceback (most recent call last):
File "/home/custom_user/.PyCharm2019.2/config/scratches/scratch_9.py", line 13, in <module>
ner_model = train_model(configs.ner.ner_ontonotes_bert_mult)
File "/home/custom_user/.local/lib/python3.6/site-packages/deeppavlov/__init__.py", line 31, in train_model
train_evaluate_model_from_config(config, download=download, recursive=recursive)
File "/home/custom_user/.local/lib/python3.6/site-packages/deeppavlov/core/commands/train.py", line 121, in train_evaluate_model_from_config
trainer.train(iterator)
File "/home/custom_user/.local/lib/python3.6/site-packages/deeppavlov/core/trainers/nn_trainer.py", line 294, in train
self.train_on_batches(iterator)
File "/home/custom_user/.local/lib/python3.6/site-packages/deeppavlov/core/trainers/nn_trainer.py", line 234, in train_on_batches
self._validate(iterator)
File "/home/custom_user/.local/lib/python3.6/site-packages/deeppavlov/core/trainers/nn_trainer.py", line 150, in _validate
metrics = list(report['metrics'].items())
AttributeError: 'NoneType' object has no attribute 'items'
There are at least two problems here:
1. instead of validation.txt there should be a valid.txt file;
2. you are trying to retrain a model that was pretrained on a different dataset with a different set of tags; that's not necessary.
To train your model from scratch you can do something like:
import json
from deeppavlov import configs, build_model, train_model

with configs.ner.ner_ontonotes_bert_mult.open(encoding='utf8') as f:
    ner_config = json.load(f)

ner_config['dataset_reader']['data_path'] = '~/my_data_dir/'  # directory with train.txt, valid.txt and test.txt files
ner_config['metadata']['variables']['NER_PATH'] = '~/where_to_save_the_model/'
ner_config['metadata']['download'] = [ner_config['metadata']['download'][-1]]  # do not download the pretrained ontonotes model

ner_model = train_model(ner_config, download=True)
The other thing that could go wrong is tokenization: "Round 23/22; 24,9 x 12,2 x 12,3" will be split by the model to ['Round', '23', '/', '22', ';', '24', ',', '9', 'x', '12', ',', '2', 'x', '12', ',', '3'] and not ['Round', '23/22', ';', '24,9 x 12,2 x 12,3'].
But you can tokenize your texts beforehand:
ner_model([['Round', '23/22', ';', '24,9 x 12,2 x 12,3']])
I tried DeepPavlov training and successfully trained the 'ner' model.
I also got the same error at first while training, and then overcame it by researching more about it.
Things to know before training:
-> You can find the 'ner_ontonotes_bert_multi.json' config file link in the DeepPavlov docs; it gives the dataset path, pretrained model path, dataset_reader and the chainer pipe used for training.
-> There is a pretrained model in the directory mentioned in the config. By default 'C:/users/{user_name}/.deeppavlov/' is the root directory, and pretrained models are stored in the 'models' subdirectory.
-> When you start training, the already trained model gets modified, which means training just tries to improve the pre-trained model.
So to train and build your own model (from scratch), simply delete the 'models' subdirectory from the '.deeppavlov' path and run the training.
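A rough sketch of that procedure, assuming the default .deeppavlov location in your home directory (the paths and the choice of config are assumptions; adjust them to your setup):

import shutil
from pathlib import Path
from deeppavlov import configs, train_model

# Remove the previously downloaded checkpoints so training starts from scratch.
models_dir = Path.home() / ".deeppavlov" / "models"
shutil.rmtree(models_dir, ignore_errors=True)

ner_model = train_model(configs.ner.ner_ontonotes_bert_mult)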
