I am trying to apply sequence-to-sequence modelling to EEG data. The encoding works just fine, but getting the decoding to work is proving problematic. The input-data has the shape None-by-3000-by-31, where the second dimension is the sequence-length.
The encoder looks like this:
initial_state = lstm_sequence_encoder.zero_state(batchsize, dtype=self.model_precision)
encoder_output, state = dynamic_rnn(
cell=LSTMCell(32),
inputs=lstm_input, # shape=(None,3000,32)
initial_state=initial_state, # zeroes
dtype=lstm_input.dtype # tf.float32
)
I use the final state of the RNN as the initial state of the decoder. For training, I use the TrainingHelper:
training_helper = TrainingHelper(target_input, [self.sequence_length])
training_decoder = BasicDecoder(
cell=lstm_sequence_decoder,
helper=training_helper,
initial_state=thought_vector
)
output, _, _ = dynamic_decode(
decoder=training_decoder,
maximum_iterations=3000
)
My troubles start when I try to implement inference. Since I am using non-sentence data, I do not need to tokenize or embed, because the data is essentially embedded already. The InferenceHelper class seemed the best way to achieve my goal. So this is what I use. I'll give my code then explain my problem.
def _sample_fn(decoder_outputs):
return decoder_outputs
def _end_fn(_):
return tf.tile([False], [self.lstm_layersize]) # Batch-size is sequence-length because of time major
inference_helper = InferenceHelper(
sample_fn=_sample_fn,
sample_shape=[32],
sample_dtype=target_input.dtype,
start_inputs=tf.zeros(batchsize_placeholder, 32), # the batchsize varies
end_fn=_end_fn
)
inference_decoder = BasicDecoder(
cell=lstm_sequence_decoder,
helper=inference_helper,
initial_state=thought_vector
)
output, _, _ = dynamic_decode(
decoder=inference_decoder,
maximum_iterations=3000
)
The Problem
I don't know what the shape of the inputs should be. I know the start-inputs should be zero because it is the first time-step. But this throws errors; it expects the input to be (1,32).
I also thought I should pass the output of each time-step unchanged to the next. However, this raises problems at run-time: the batch-size varies, so the shape is partial. The library throws an exception at this as it tries to convert the start_input to a tensor:
...
self._start_inputs = ops.convert_to_tensor(
start_inputs, name='start_inputs')
Any ideas?
This is a lesson in poor documentation.
I fixed my problem, but failed to address the variable batch-size problem.
The _end_fn was causing problems I was unaware of. I also managed to work out what the appropriate fields are for the InferenceHelper. I've given the fields names in case anyone needs guidance in future
def _end_fn(_):
return tf.tile([False], [batchsize])
inference_helper = InferenceHelper(
sample_fn=_sample_fn,
sample_shape=[lstm_number_of_units], # In my case, 32
sample_dtype=tf.float32, # Depends on the data
start_inputs=tf.zeros((batchsize, lstm_number_of_units)),
end_fn=_end_fn
)
As for the batch-size problem, there are two things I'm considering:
Changing the internal state of my model object. My TensorFlow computation graph is built inside a class. A class-field records the batch-size. Changing this during training may work. Or:
Pad the batches so that they are 200 sequences long. This will waste time.
Preferably I'd like a way to dynamically manage the batch-sizes.
EDIT: I found a way. It involves simply substituting square-brackets for parentheses:
inference_helper = InferenceHelper(
sample_fn=_sample_fn,
sample_shape=[self.lstm_layersize],
sample_dtype=target_input.dtype,
start_inputs=tf.zeros([batchsize, self.lstm_layersize]),
end_fn=_end_fn
)
Related
I'm trying to fine tune a GPT2-based model on my data using the run_clm.py example script from HuggingFace.
I have a .json data file that looks like this:
...
{"text": "some text"}
{"text": "more text"}
...
I had to change the default behavior of the script that used to concatenate input text, because all my examples are separate demonstrations that should not be concatenated:
def add_labels(example):
example['labels'] = example['input_ids'].copy()
return example
with training_args.main_process_first(desc="grouping texts together"):
lm_datasets = tokenized_datasets.map(
add_labels,
batched=False,
# batch_size=1,
num_proc=data_args.preprocessing_num_workers,
load_from_cache_file=not data_args.overwrite_cache,
desc=f"Grouping texts in chunks of {block_size}",
)
This essentially only adds the appropriate 'labels' field required by CLM.
However since GPT2 has a 1024-sized context-window, the examples should be padded to that length.
I can achieve this by modifying the tokenization procedure like this:
def tokenize_function(examples):
with CaptureLogger(tok_logger) as cl:
output = tokenizer(
examples[text_column_name], padding='max_length') # added: padding='max_length'
# ...
The training runs correctly.
However, I believe this should not be done by the tokenizer, but by the data collator instead. When I remove padding='max_length' from the tokenizer, I get the following error:
ValueError: Unable to create tensor, you should probably activate truncation and/or padding with 'padding=True' 'truncation=True' to have batched tensors with the same length. Perhaps your features (`labels` in this case) have excessive nesting (inputs type `list` where type `int` is expected).
And also, above that:
Traceback (most recent call last):
File "/home/jan/repos/text2task/.venv/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 716, in convert_to_tensors
tensor = as_tensor(value)
ValueError: expected sequence of length 9 at dim 1 (got 33)
During handling of the above exception, another exception occurred:
To fix this, I have created a data collator that should do the padding:
data_collator = DataCollatorWithPadding(tokenizer, padding='max_length')
This is what is passed to the trainer. However, the above error remains.
What's going on?
I managed to fix the error but I'm really unsure about my solution, details below. Will accept a better answer.
This seems to solve it:
data_collator = DataCollatorForSeq2Seq(tokenizer, model=model, padding=True)
Found in the documentation
It seems like DataCollatorWithPadding doesn't pad the labels?
My problem is about generating an output sequence from an input sequence, so I'm guessing that using DataCollatorForSeq2Seq is what I actually want to do. However, my data does not have separate input and target columns, but a single text column (that contains a string input => target). I'm not really that this collator is intended to be used for GPT2...
Apologies in advance.
I am attempting to recreate this CNN (from the Keras Code Examples), with another dataset.
https://keras.io/examples/vision/image_classification_from_scratch/
The dataset I am using is one for retinal scans, and classifies images on a scale from 0-4. So, it's a multi-label image classification.
The Keras example used is binary classification (cats v dogs), though I would have hoped it wouldn't make much difference (maybe this is a big assumption on my part).
I skipped the 'image augmentation' part of the walkthrough. So, I have not created the
data_augmentation = keras.Sequential(
[
layers.RandomFlip("horizontal"),
layers.RandomRotation(0.1),
]
)
part. So, instead of:
def make_model(input_shape, num_classes):
inputs = keras.Input(shape=input_shape)
# Image augmentation block
x = data_augmentation(inputs)
# Entry block
x = layers.Rescaling(1.0 / 255)(x)
.......
at the beginning of the model, I have:
def make_model(input_shape, num_classes):
inputs = keras.Input(shape=input_shape)
# Image augmentation block
x = keras.Sequential(inputs)
# Entry block
x = layers.Rescaling(1.0 / 255)(x)
.......
However I keep getting different errors no matter how much I try to change things around, such as "TypeError: Keras symbolic inputs/outputs do not implement __len__.", or "ValueError: Exception encountered when calling layer "rescaling_3" (type Rescaling).".
What am I missing here?
The current Keras Captcha OCR model returns a CTC encoded output, which requires decoding after inference.
To decode this, one needs to run a decoding utility function after inference as a separate step.
preds = prediction_model.predict(batch_images)
pred_texts = decode_batch_predictions(preds)
The decoded utility function uses keras.backend.ctc_decode, which in turn uses either a greedy or beam search decoder.
# A utility function to decode the output of the network
def decode_batch_predictions(pred):
input_len = np.ones(pred.shape[0]) * pred.shape[1]
# Use greedy search. For complex tasks, you can use beam search
results = keras.backend.ctc_decode(pred, input_length=input_len, greedy=True)[0][0][
:, :max_length
]
# Iterate over the results and get back the text
output_text = []
for res in results:
res = tf.strings.reduce_join(num_to_char(res)).numpy().decode("utf-8")
output_text.append(res)
return output_text
I would like to train a Captcha OCR model using Keras that returns the CTC decoded as an output, without requiring an additional decoding step after inference.
How would I achieve this?
The most robust way to achieve this is by adding a method which is called as part of the model definition:
def CTCDecoder():
def decoder(y_pred):
input_shape = tf.keras.backend.shape(y_pred)
input_length = tf.ones(shape=input_shape[0]) * tf.keras.backend.cast(
input_shape[1], 'float32')
unpadded = tf.keras.backend.ctc_decode(y_pred, input_length)[0][0]
unpadded_shape = tf.keras.backend.shape(unpadded)
padded = tf.pad(unpadded,
paddings=[[0, 0], [0, input_shape[1] - unpadded_shape[1]]],
constant_values=-1)
return padded
return tf.keras.layers.Lambda(decoder, name='decode')
Then defining the model as follows:
prediction_model = keras.models.Model(inputs=inputs, outputs=CTCDecoder()(model.output))
Credit goes to tulasiram58827.
This implementation supports exporting to TFLite, but only float32. Quantized (int8) TFLite export is still throwing an error, and is an open ticket with TF team.
Your question can be interpreted in two ways. One is: I want a neural network that solves a problem where the CTC decoding step is already inside what the network learned. The other one is that you want to have a Model class that does this CTC decoding inside of it, without using an external, functional function.
I don't know the answer to the first question. And I cannot even tell if it's feasible or not. In any case, sounds like a difficult theoretical problem and if you don't have luck here, you might want to try posting it in datascience.stackexchange.com, which is a more theory-oriented community.
Now, if what you are trying to solve is the second, engineering version of the problem, that's something I can help you with. The solution for that problem is the following:
You need to subclass keras.models.Model with a class with the method you want. I went over the tutorial in the link you posted and came with the following class:
class ModifiedModel(keras.models.Model):
# A utility function to decode the output of the network
def decode_batch_predictions(self, pred):
input_len = np.ones(pred.shape[0]) * pred.shape[1]
# Use greedy search. For complex tasks, you can use beam search
results = keras.backend.ctc_decode(pred, input_length=input_len, greedy=True)[0][0][
:, :max_length
]
# Iterate over the results and get back the text
output_text = []
for res in results:
res = tf.strings.reduce_join(num_to_char(res)).numpy().decode("utf-8")
output_text.append(res)
return output_text
def predict_texts(self, batch_images):
preds = self.predict(batch_images)
return self.decode_batch_predictions(preds)
You can give it the name you want, it's just for illustration purposes.
With this class defined, you would replace the line
# Get the prediction model by extracting layers till the output layer
prediction_model = keras.models.Model(
model.get_layer(name="image").input, model.get_layer(name="dense2").output
)
with
prediction_model = ModifiedModel(
model.get_layer(name="image").input, model.get_layer(name="dense2").output
)
And then you can replace the lines
preds = prediction_model.predict(batch_images)
pred_texts = decode_batch_predictions(preds)
with
pred_texts = prediction_model.predict_texts(batch_images)
I am having trouble with modifying Keras' ImageDataGenerator in a custom way such that I can perform say, SaltAndPepper Noise and Gaussian Blur (which they do not offer). I know this type of question has been asked many times before, and I have read almost every link possible below:
But due to my inability to understand the full source code or the lack thereof of python knowledge; I am struggling to implement these two additional types of augmentation in ImageDataGenerator as a custom one. I very much wish someone could point me in the right direction on how to modify the source code, or any other way.
Use a generator for Keras model.fit_generator
Custom Keras Data Generator with yield
Keras Realtime Augmentation adding Noise and Contrast
Data Augmentation Image Data Generator Keras Semantic Segmentation
https://stanford.edu/~shervine/blog/keras-how-to-generate-data-on-the-fly
https://github.com/keras-team/keras/issues/3338
https://towardsdatascience.com/image-augmentation-14a0aafd0498
https://towardsdatascience.com/image-augmentation-for-deep-learning-using-keras-and-histogram-equalization-9329f6ae5085
An example of SaltAndPepper noise is as follows and I wish to add more types of augmentations into ImageDataGenerator:
class SaltAndPepperNoise:
def __init__(self, replace_probs=0.1, pepper=0, salt=255, noise_type="RGB"):
"""
It is important to know that the replace_probs here is the
Probability of replacing a "pixel" to salt and pepper noise.
"""
self.replace_probs = replace_probs
self.pepper = pepper
self.salt = salt
self.noise_type = noise_type
def get_aug(self, img, bboxes):
if self.noise_type == "SnP":
random_matrix = np.random.rand(img.shape[0], img.shape[1])
img[random_matrix >= (1 - self.replace_probs)] = self.salt
img[random_matrix <= self.replace_probs] = self.pepper
elif self.noise_type == "RGB":
random_matrix = np.random.rand(img.shape[0], img.shape[1], img.shape[2])
img[random_matrix >= (1 - self.replace_probs)] = self.salt
img[random_matrix <= self.replace_probs] = self.pepper
return img, bboxes
I want to do a similar thing in my code. I am reading the documentation here. See the parameter preprocessing_function. You can implement a function and then you can pass it to this parameter to ImageDataGenerator.
I edit my answer to show you a practical example:
def my_func(img):
return img/255
train_datagen = ImageDataGenerator(preprocessing_function =my_func)
Here I just implement a short function that rescales your data, but you can implement noises and so on.
I would like to know if Keras can be used as an interface to TensoFlow for only doing computation on my GPU.
I tested TF directly on my GPU. But for ML purposes, I started using Keras, including the backend. I would find it 'comfortable' to do all my stuff in Keras instead of Using two tools.
This is also a matter of curiosity.
I found some examples like this one:
http://christopher5106.github.io/deep/learning/2018/10/28/understand-batch-matrix-multiplication.html
However this example does not actually do the calculation.
It also does not get input data.
I duplicate the snippet here:
'''
from keras import backend as K
a = K.ones((3,4))
b = K.ones((4,5))
c = K.dot(a, b)
print(c.shape)
'''
I would simply like to know if I can get the result numbers from this snippet above, and how?
Thanks,
Michel
Keras doesn't have an eager mode like Tensorflow, and it depends on models or functions with "placeholders" to receive and output data.
So, it's a little more complicated than Tensorflow to do basic calculations like this.
So, the most user friendly solution would be creating a dummy model with one Lambda layer. (And be careful with the first dimension that Keras will insist to understand as a batch dimension and require that input and output have the same batch size)
def your_function_here(inputs):
#if you have more than one tensor for the inputs, it's a list:
input1, input2, input3 = inputs
#if you don't have a batch, you should probably have a first dimension = 1 and get
input1 = input1[0]
#do your calculations here
#if you used the batch_size=1 workaround as above, add this dimension again:
output = K.expand_dims(output,0)
return output
Create your model:
inputs = Input(input_shape)
#maybe inputs2 ....
outputs = Lambda(your_function_here)(list_of_inputs)
#maybe outputs2
model = Model(inputs, outputs)
And use it to predict the result:
print(model.predict(input_data))