tensorflow only predicts '[UNK]' characters

I am trying to generate text in a certain style using TensorFlow. Even when I copy and paste the code from the TensorFlow website, it only predicts unknown characters, even though there is a mask to prevent this. I suspect it has something to do with the version of Python I'm running (3.9.15), since it doesn't even work with their code and dataset.
Has anyone else run into this issue?
class OneStep(tf.keras.Model):
    def __init__(self, model, chars_from_ids, ids_from_chars, temperature=1.0):
        super().__init__()
        self.temperature = temperature
        self.model = model
        self.chars_from_ids = chars_from_ids
        self.ids_from_chars = ids_from_chars

        # Create a mask to prevent "[UNK]" from being generated.
        skip_ids = self.ids_from_chars(['[UNK]'])[:, None]
        sparse_mask = tf.SparseTensor(
            # Put a -inf at each bad index.
            values=[-float('inf')] * len(skip_ids),
            indices=skip_ids,
            # Match the shape to the vocabulary
            dense_shape=[len(ids_from_chars.get_vocabulary())])
        self.prediction_mask = tf.sparse.to_dense(sparse_mask)

    @tf.function
    def generate_one_step(self, inputs, states=None):
        # Convert strings to token IDs.
        input_chars = tf.strings.unicode_split(inputs, 'UTF-8')
        input_ids = self.ids_from_chars(input_chars).to_tensor()

        # Run the model.
        # predicted_logits.shape is [batch, char, next_char_logits]
        predicted_logits, states = self.model(inputs=input_ids, states=states,
                                              return_state=True)
        # Only use the last prediction.
        predicted_logits = predicted_logits[:, -1, :]
        predicted_logits = predicted_logits/self.temperature
        # Apply the prediction mask: prevent "[UNK]" from being generated.
        predicted_logits = predicted_logits + self.prediction_mask

        # Sample the output logits to generate token IDs.
        predicted_ids = tf.random.categorical(predicted_logits, num_samples=1)
        predicted_ids = tf.squeeze(predicted_ids, axis=-1)

        # Convert from token ids to characters
        predicted_chars = self.chars_from_ids(predicted_ids)

        # Return the characters and model state.
        return predicted_chars, states
I lifted this straight from their tutorial and tried to run it in my environment, and it just predicted '[UNK]'.
I'm running an M1 Mac with the latest version of TensorFlow, so that may also be an issue.
Tutorial for reference: https://www.tensorflow.org/text/tutorials/text_generation
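For anyone trying to narrow this down, here is a minimal NumPy sketch of what the prediction mask is supposed to do, separate from TensorFlow. The five-token vocabulary, the logits, and the helper name sample_masked are made up for illustration. It only shows that adding -inf at the '[UNK]' index gives that token zero probability; it does not diagnose the behaviour on any particular machine.

```python
import numpy as np

# Hypothetical 5-token vocabulary; index 0 plays the role of '[UNK]'.
vocab = ['[UNK]', 'a', 'b', 'c', 'd']
unk_id = vocab.index('[UNK]')

# Dense mask: -inf at the '[UNK]' index, 0 everywhere else.
prediction_mask = np.zeros(len(vocab))
prediction_mask[unk_id] = -np.inf


def sample_masked(logits, rng):
    """Add the mask to the logits, then sample one token id."""
    masked = logits + prediction_mask
    # Softmax: exp(-inf) is 0, so '[UNK]' ends up with probability 0.
    probs = np.exp(masked - np.max(masked))
    probs = probs / probs.sum()
    return rng.choice(len(vocab), p=probs), probs


rng = np.random.default_rng(0)
# Logits that strongly favour '[UNK]' before masking.
logits = np.array([10.0, 0.1, 0.2, 0.3, 0.4])
samples = [sample_masked(logits, rng)[0] for _ in range(1000)]
_, probs = sample_masked(logits, rng)

print("probability of [UNK] after masking:", probs[unk_id])
print("times [UNK] was sampled:", samples.count(unk_id))
```

If the same check passes on the real prediction_mask and logits but '[UNK]' still shows up, the sampling step or the ID-to-character lookup would be the next places to look.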


Extract the features of last layer from the pytorch-fasterrcnn-resnet50-fpn

I have an image, and I need to use pytorch-fasterrcnn-resnet50-fpn to extract its features. Below is the code that I am trying:
import torchvision
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained = True)
### strip the last layer
feature_extractor = torch.nn.Sequential(*list(model_ft.children())[:-1])
inputs = feature_extractor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
    last_hidden_states = outputs.last_hidden_state
Here the type of image is PIL.JpegImagePlugin.JpegImageFile
The above code is not working for getting the features. Can anyone tell me how to solve this?

Tensorflow : How to retrieve parts of my multi input Dataset and their respective loss?

First of all, I am quite new to how AI and TensorFlow work.
My problem is the following: I need to train my neural network on pairs of images, one that is unchanged and the same one transformed. This implies a joint loss calculation over the paired images at the end, in order to calculate the mutual information for an unsupervised image analysis problem.
Also, since my dataset consists of 4,000 RGB images of 256*256 pixels, I need to use a data generator.
Here is what I have done so far for my data generator:
class dataset(object):
    def __init__(self, data_list, batch_size):
        self.dataset = None
        self.batch_size = batch_size
        self.current_batch = 0
        self.data_list = data_list
        self.normal_image = None
        self.transformed_image = None
        self.label = None

    def generator(self):
        index = self.current_batch * self.batch_size
        self.current_batch = self.current_batch + 1
        for image, label in self.data_list[index:]:
            self.label = label
            image = image / 255.0
            self.normal_image = image
            self.transformed_image = utils.get_random_crop(image, height = 200, width = 200)
            yield ({'normal_image' : self.normal_image,
                    'transformed_image' : self.transformed_image},
                   {'label' : self.label})

    def data_loader(self):
        # The output types mirror the (inputs dict, labels dict) tuple yielded above
        self.dataset = tf.data.Dataset.from_generator(self.generator,
                                                      ({'normal_image' : tf.float32,
                                                        'transformed_image' : tf.float32},
                                                       {'label' : tf.int32})).batch(self.batch_size)
        return self.dataset

train_dataset = dataset(train_list, BATCH_SIZE)
test_dataset = dataset(test_list, BATCH_SIZE)
Note that train_list and test_list are just raw NumPy arrays that I have retrieved from my image collection.
Here are my 2 questions:
1. How can I retrieve the loss for my normal and transformed images specifically, so that I can do a joint loss calculation at the end of each epoch?
2. My data generator seems to work fine: each next() retrieves the next batch of my collection. However, as you can see, I have a (kind of?) tuple inside my dataset: {normal_image, transformed_image}. I am having a hard time finding out how to access one specific element of this (kind of?) tuple in order to feed my CNN with the normal_image and the transformed_image one at a time, etc.
dataset.transformed_image would have been too good, haha!
Also, in my dataset class I have self.normal_image and self.transformed_image, but I use them only for plotting. They are not tensors like the ones in my dataset :(
Thanks for your time!
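On the second question, each element the generator above yields is a tuple of two dictionaries, so the individual images are selected by key once the tuple is unpacked. The sketch below uses plain Python and NumPy as a stand-in: paired_generator and the fixed top-left crop are hypothetical replacements for the class above and for utils.get_random_crop. Iterating over a tf.data.Dataset built from such a generator should give the same nesting, with tensors as the dictionary values and a leading batch dimension after .batch().

```python
import numpy as np


def paired_generator(data_list, crop=200):
    """Yield (inputs dict, targets dict) for each (image, label) pair."""
    for image, label in data_list:
        image = image / 255.0
        # Stand-in for utils.get_random_crop: a fixed top-left crop.
        transformed = image[:crop, :crop, :]
        yield ({'normal_image': image, 'transformed_image': transformed},
               {'label': label})


# Two placeholder 256x256 RGB images with integer labels.
data_list = [(np.full((256, 256, 3), 255.0), 0),
             (np.zeros((256, 256, 3)), 1)]

for inputs, targets in paired_generator(data_list):
    normal = inputs['normal_image']            # pick one input by key
    transformed = inputs['transformed_image']
    label = targets['label']
    print(normal.shape, transformed.shape, label)
```

With the two images available separately like this, each can be passed through the network on its own and the two outputs combined into a joint loss.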

Find wrongly categorized samples from validation step

I am using a Keras neural net to identify the category to which the data belongs.
optimizer=keras.optimizers.Adam(lr=0.001, decay=0.0001),
Fit function
history = self.model.fit(self.X,
{'output': self.Y},
I am interested in finding out which labels are getting categorized wrongly in the validation step. Seems like a good way to understand what is happening under the hood.
You can use model.predict_classes(validation_data) to get the predicted classes for your validation data, and compare these predictions with the actual labels to find out where the model was wrong. Something like this:
predictions = model.predict_classes(validation_data)
wrong = np.where(predictions != Y_validation)
If you are interested in looking 'under the hood', I'd suggest looking at the scores for each class, for each observation of the validation set.
This should shed some light on which categories the model is not so good at classifying. The way to predict the final class is
scores = model.predict(validation_data_x)
preds = np.argmax(scores, axis=1)
Be sure to use the proper axis for np.argmax (I'm assuming your class axis is 1, i.e. scores has shape (n_samples, n_classes)). Then compare preds with the real classes.
Also, if you want to see the overall accuracy on this dataset, use
model.evaluate(x=validation_data_x, y=validation_data_y)
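Putting the steps above together, here is a small NumPy sketch of finding the misclassified validation samples. The scores and one-hot labels are made-up placeholders standing in for the output of model.predict and the validation labels. It uses np.argmax rather than predict_classes, which newer Keras versions no longer provide.

```python
import numpy as np

# Placeholder scores with shape (n_samples, n_classes), standing in for
# the output of model.predict on the validation inputs.
scores = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.3, 0.6],
                   [0.2, 0.5, 0.3],
                   [0.6, 0.3, 0.1]])
# Placeholder one-hot validation labels.
y_true_one_hot = np.array([[1, 0, 0],
                           [0, 1, 0],
                           [0, 1, 0],
                           [0, 0, 1]])

preds = np.argmax(scores, axis=1)            # predicted class per sample
y_true = np.argmax(y_true_one_hot, axis=1)   # true class per sample

# Indices of the validation samples the model got wrong.
wrong_idx = np.where(preds != y_true)[0]
for i in wrong_idx:
    print(f"sample {i}: true={y_true[i]} predicted={preds[i]}")
```

The indices in wrong_idx can then be used to pull the corresponding rows out of the validation inputs for inspection.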
I ended up creating a metric which prints the "worst performing category id + score" on each iteration. Ideas from link
import tensorflow as tf
import numpy as np
class MaxIoU(object):
    def __init__(self, num_classes):
        self.num_classes = num_classes

    def max_iou(self, y_true, y_pred):
        # Wraps np_max_iou method and uses it as a TensorFlow op.
        # Takes numpy arrays as its arguments and returns numpy arrays as
        # its outputs.
        return tf.py_func(self.np_max_iou, [y_true, y_pred], tf.float32)

    def np_max_iou(self, y_true, y_pred):
        # Compute the confusion matrix to get the number of true positives,
        # false positives, and false negatives
        # Convert predictions and target from categorical to integer format
        target = np.argmax(y_true, axis=-1).ravel()
        predicted = np.argmax(y_pred, axis=-1).ravel()

        # Trick from torchnet for bincounting 2 arrays together
        # https://github.com/pytorch/tnt/blob/master/torchnet/meter/confusionmeter.py
        x = predicted + self.num_classes * target
        bincount_2d = np.bincount(x.astype(np.int32), minlength=self.num_classes**2)
        assert bincount_2d.size == self.num_classes**2
        conf = bincount_2d.reshape((self.num_classes, self.num_classes))

        # Compute the IoU and mean IoU from the confusion matrix
        true_positive = np.diag(conf)
        false_positive = np.sum(conf, 0) - true_positive
        false_negative = np.sum(conf, 1) - true_positive

        # Just in case we get a division by 0, ignore/hide the error and set the value to 0
        with np.errstate(divide='ignore', invalid='ignore'):
            iou = false_positive / (true_positive + false_positive + false_negative)
        iou[np.isnan(iou)] = 0

        return np.max(iou).astype(np.float32) + np.argmax(iou).astype(np.float32)
custom_metric = MaxIoU(len(catagories))
optimizer=keras.optimizers.Adam(lr=0.001, decay=0.0001),
metrics=[categorical_accuracy, custom_metric.max_iou])
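The bincount trick used in np_max_iou can be tried on its own with a few hand-written labels. The sketch below is a standalone NumPy illustration with made-up target and predicted arrays, and the helper name confusion_matrix is mine. Note that it computes the usual per-class IoU with true positives in the numerator, whereas the metric above puts false positives there.

```python
import numpy as np


def confusion_matrix(target, predicted, num_classes):
    """Rows are true classes, columns are predicted classes."""
    # Each (target, predicted) pair maps to one cell index of the
    # flattened num_classes x num_classes matrix.
    x = predicted + num_classes * target
    counts = np.bincount(x.astype(np.int64), minlength=num_classes ** 2)
    return counts.reshape((num_classes, num_classes))


# Made-up integer labels for six samples and three classes.
target = np.array([0, 0, 1, 1, 2, 2])
predicted = np.array([0, 1, 1, 1, 2, 0])
conf = confusion_matrix(target, predicted, num_classes=3)

true_positive = np.diag(conf)
false_positive = conf.sum(axis=0) - true_positive
false_negative = conf.sum(axis=1) - true_positive
iou = true_positive / (true_positive + false_positive + false_negative)

print(conf)
print("per-class IoU:", iou)
print("worst class:", int(np.argmin(iou)))
```

Keeping the per-class scores as an array, rather than packing the class id and score into one float, makes it easier to log all categories at once.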

How to correctly encode labels with tensorflow's one-hot encoding?

I've been trying to learn TensorFlow with Python 3.6 and decided on building a facial recognition program using data from the University of Essex's face database (http://cswww.essex.ac.uk/mv/allfaces/index.html). So far I've been following TensorFlow's MNIST Expert guide, but when I start testing, my accuracy is 0 for every epoch, so I know something is wrong. I feel most shaky on how I'm handling the labels, so I figure that's where the problem is.
The labels in the dataset are either numeric IDs, like 987323, or someone's name, like "fordj". My idea to deal with this was to create a "pre-encoding" encode_labels function, which gives each unique label in the test and training sets their own unique integer value. I checked to make sure each unique label in the test and train sets have the same unique value. It also returns a dictionary so that I can easily map back to the original label from the encoded version. If I don't do this step and pass the labels as I retrieve them (i.e "fordj"), I get an error saying
UnimplementedError (see above for traceback): Cast string to int32 is not supported
[[Node: Cast = CastDstT=DT_INT32, SrcT=DT_STRING, _device="/job:localhost/replica:0/task:0/device:CPU:0"]]
The way I'm interpreting this is that since many of the labels are people's names, tensorflow can't convert a label like "fordj" to a tf.int32. The code to grab labels and paths is here:
def get_paths_and_labels(path):
    """ image_paths  :  list of relative image paths
        labels       :  mix of alphanumeric characters """
    image_paths = [path + image for image in os.listdir(path)]
    labels = [i.split(".")[-3] for i in image_paths]
    labels = [i.split("/")[-1] for i in labels]
    return image_paths, labels

def encode_labels(train_labels, test_labels):
    """ Assigns a numeric value to each label since some are subject's names """
    found_labels = []
    index = 0
    mapping = {}
    for i in train_labels:
        # Skip labels that already have an ID
        if i in found_labels:
            continue
        mapping[i] = index
        index += 1
        found_labels.append(i)
    return [mapping[i] for i in train_labels], [mapping[i] for i in test_labels], mapping
Here is how I assign my training and testing labels. I then want to use tensorflow's one-hot encoder to encode them again for me.
def main():
    # Grabs the labels and each image's relative path
    train_image_paths, train_labels = get_paths_and_labels(TRAIN_PATH)
    # Smallish dataset so I can read it all into memory
    train_images = [cv2.imread(image) for image in train_image_paths]
    test_image_paths, test_labels = get_paths_and_labels(TEST_PATH)
    test_images = [cv2.imread(image) for image in test_image_paths]
    num_classes = len(set(train_labels))

    # Placeholders
    x = tf.placeholder(tf.float32, shape=[None, IMAGE_SIZE[0] * IMAGE_SIZE[1]])
    y_ = tf.placeholder(tf.float32, shape=[None, num_classes])
    x_image = tf.reshape(x, [-1, IMAGE_SIZE[0], IMAGE_SIZE[1], 1])

    # One-hot labels
    train_labels, test_labels, mapping = encode_labels(train_labels, test_labels)
    train_labels = tf.one_hot(indices=tf.cast(train_labels, tf.int32), depth=num_classes)
    test_labels = tf.one_hot(indices=tf.cast(test_labels, tf.int32), depth=num_classes)
I'm sure I'm doing something wrong. I know sklearn has a LabelEncoder, though I haven't tried it out yet. Thanks for any advice on this, all help is appreciated!
The way I'm interpreting this is that since many of the labels are people's names, tensorflow can't convert a label like "fordj" to a tf.int32.
You're right. TensorFlow can't do that. Instead, you can create a mapping function from a name to a unique (and progressive) ID. Once you have done that, you can correctly encode every numeric ID with its one-hot representation.
You already have the relation between the numeric ID and the string label, hence you can do something like:
train_labels, test_labels, mapping = encode_labels(train_labels, test_labels)
# encode_labels already returns integer IDs, so they can go straight into tf.one_hot
one_hot_train_labels = tf.one_hot(indices=train_labels, depth=num_classes)
one_hot_test_labels = tf.one_hot(indices=test_labels, depth=num_classes)
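The two-step idea (string label to integer ID, then integer ID to one-hot vector) can be checked outside TensorFlow first. The following is a minimal NumPy sketch under a few assumptions: the label values are placeholders ('smith' is invented), and this encode_labels is a simplified stand-in that also assigns IDs to labels seen only in the test set.

```python
import numpy as np


def encode_labels(train_labels, test_labels):
    """Map each distinct label to a consecutive integer ID."""
    mapping = {}
    for label in list(train_labels) + list(test_labels):
        if label not in mapping:
            mapping[label] = len(mapping)
    train_ids = [mapping[label] for label in train_labels]
    test_ids = [mapping[label] for label in test_labels]
    return train_ids, test_ids, mapping


# Placeholder labels: a mix of numeric IDs and names, all as strings.
train_labels = ['fordj', '987323', 'fordj', 'smith']
test_labels = ['987323', 'smith']

train_ids, test_ids, mapping = encode_labels(train_labels, test_labels)
num_classes = len(mapping)

# Row i of the identity matrix is the one-hot vector for class i.
one_hot_train = np.eye(num_classes, dtype=np.float32)[train_ids]
one_hot_test = np.eye(num_classes, dtype=np.float32)[test_ids]

print(mapping)
print(one_hot_train)
```

If the one-hot rows look right here, the same integer IDs can be handed to tf.one_hot with depth=num_classes.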

Get gradient value necessary to break an image

I've been experimenting with adversarial images and I read up on the fast gradient sign method from the following link https://arxiv.org/pdf/1412.6572.pdf...
The instructions explain that the necessary gradient can be calculated using backpropagation...
I've been successful at generating adversarial images but I have failed at attempting to extract the gradient necessary to create an adversarial image. I will demonstrate what I mean.
Let us assume that I have already trained my algorithm using logistic regression. I restore the model and I extract the number I wish to change into an adversarial image. In this case it is the number 2...
# construct model
logits = tf.matmul(x, W) + b
pred = tf.nn.softmax(logits)
# assign the images of number 2 to the variable
sess.run(tf.assign(x, labels_of_2))
# setup softmax
# placeholder for target label
fake_label = tf.placeholder(tf.int32, shape=[1])
# setup the fake loss
fake_loss = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits,labels=fake_label)
# minimize fake loss using gradient descent,
# calculating the derivatives of the weight of the fake image will give the direction of weights necessary to change the prediction
adversarial_step = tf.train.GradientDescentOptimizer(learning_rate=FLAGS.learning_rate).minimize(fake_loss, var_list=[x])
# continue calculating the derivative until the prediction changes for all 10 images
for i in range(FLAGS.training_epochs):
    # fake label tells the training algorithm to use the weights calculated for number 6
    sess.run(adversarial_step, feed_dict={fake_label:np.array([6])})
This is my approach, and it works perfectly. It takes my image of number 2 and changes it only slightly so that when I run the following...
x_in = np.expand_dims(x[0], axis=0)
classification = sess.run(tf.argmax(pred, 1))
it will predict the number 2 as a number 6.
The issue is, I need to extract the gradient necessary to trick the neural network into thinking number 2 is 6. I need to use this gradient to create the nematode mentioned above.
I am not sure how can I extract the gradient value. I tried looking at tf.gradients but I was unable to figure out how to produce an adversarial image using this function. I implemented the following after the fake_loss variable above...
gradients = tf.gradients(fake_loss, x)
for i in range(FLAGS.training_epochs):
    # calculate gradient with weight of number 6
    gradient_value = sess.run(gradients, feed_dict={fake_label:np.array([6])})
    # update the image of number 2
    gradient_update = x+0.007*gradient_value[0]
    sess.run(tf.assign(x, gradient_update))
Unfortunately the prediction did not change in the way I wanted, and moreover this logic resulted in a rather blurry image.
I would appreciate an explanation as to what I need to do in order to calculate and extract the gradient that will trick the neural network, so that if I were to take this gradient and apply it to my image as a nematode, it will result in a different prediction.
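One possible reading of the manual loop above: GradientDescentOptimizer subtracts the gradient of fake_loss from x, while the tf.gradients loop adds it, which moves the image away from the target class instead of towards it. The NumPy sketch below illustrates a targeted sign-gradient step under stated assumptions: a softmax-regression model with random placeholder weights W and b, a random placeholder image x, and a target class chosen to differ from the current prediction. It is not the trained model from the question.

```python
import numpy as np


def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()


def loss_grad_wrt_input(x, W, b, target):
    """Gradient of softmax cross-entropy for `target` with respect to x.

    For logits = x @ W + b this equals W @ (softmax(logits) - one_hot(target)).
    """
    p = softmax(x @ W + b)
    p[target] -= 1.0
    return W @ p


rng = np.random.default_rng(0)
n_features, n_classes = 784, 10
W = rng.normal(scale=0.1, size=(n_features, n_classes))  # placeholder weights
b = np.zeros(n_classes)
x = rng.uniform(size=n_features)                          # placeholder image

original = int(np.argmax(x @ W + b))
target = (original + 4) % n_classes   # any class other than the current one

x_adv = x.copy()
for _ in range(200):
    if int(np.argmax(x_adv @ W + b)) == target:
        break
    grad = loss_grad_wrt_input(x_adv, W, b, target)
    # Descend the target-class loss: subtract the signed gradient.
    x_adv = x_adv - 0.007 * np.sign(grad)

# The accumulated change is the perturbation that was applied to the image.
perturbation = x_adv - x
print(original, "->", int(np.argmax(x_adv @ W + b)))
```

Here the quantity to extract is perturbation, the difference between the final and the original image, rather than any single gradient evaluation.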
Why not let the Tensorflow optimizer add the gradients to your image? You can still evaluate the nematode to get the resulting gradients that were added.
I created a bit of sample code to demonstrate this with a panda image. It uses the VGG16 neural network to transform your own panda image into a "goldfish" image. Every 100 iterations it saves the image as PDF so you can print it losslessly to check if your image is still a goldfish.
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
import IPython.display as ipyd
from libs import vgg16 # Download here! https://github.com/pkmital/CADL/tree/master/session-4/libs

pandaimage = plt.imread('panda.jpg')
pandaimage = vgg16.preprocess(pandaimage)
img_4d = np.array([pandaimage])

g = tf.get_default_graph()
input_placeholder = tf.Variable(img_4d,trainable=False)
to_add_image = tf.Variable(tf.random_normal([224,224,3], mean=0.0, stddev=0.1, dtype=tf.float32))
combined_images_not_clamped = input_placeholder+to_add_image

# Clamp the combined image to the [0, 1] range
filledmax = tf.fill(tf.shape(combined_images_not_clamped), 1.0)
filledmin = tf.fill(tf.shape(combined_images_not_clamped), 0.0)
greater_than_one = tf.greater(combined_images_not_clamped, filledmax)
combined_images_with_max = tf.where(greater_than_one, filledmax, combined_images_not_clamped)
lower_than_zero =tf.less(combined_images_with_max, filledmin)
combined_images = tf.where(lower_than_zero, filledmin, combined_images_with_max)

net = vgg16.get_vgg_model()
tf.import_graph_def(net['graph_def'], name='vgg')
names = [op.name for op in g.get_operations()]

# Import the network again, this time fed with the combined (panda + noise) image
style_layer = 'prob:0'
the_prediction = tf.import_graph_def(
    net['graph_def'],
    name='vgg',
    input_map={'images:0': combined_images},return_elements=[style_layer])

goldfish_expected_np = np.zeros(1000)
goldfish_expected_np[1] = 1.0  # assumed: index 1 is "goldfish" in the ImageNet label list; check net['labels']
goldfish_expected_tf = tf.Variable(goldfish_expected_np,dtype=tf.float32,trainable=False)
loss = tf.reduce_sum(tf.square(the_prediction[0]-goldfish_expected_tf))
optimizer = tf.train.AdamOptimizer().minimize(loss)

sess = tf.InteractiveSession()
sess.run(tf.global_variables_initializer())  # variables must be initialized before the first run

def show_many_images(*images):
    fig = plt.figure()
    for i in range(len(images)):
        subplot_number = 100+10*len(images)+(i+1)
        plt.subplot(subplot_number)
        plt.imshow(images[i])
    plt.show()

for i in range(1000):
    _, loss_val = sess.run([optimizer,loss])
    if i%100==1:
        print("Loss at iteration %d: %f" % (i,loss_val))
        _, loss_val,adversarial_image,pred,nematode = sess.run([optimizer,loss,combined_images,the_prediction,to_add_image])
        res = np.squeeze(pred)
        average = np.mean(res, 0)
        res = res / np.sum(average)
        print([(res[idx], net['labels'][idx]) for idx in res.argsort()[-5:][::-1]])
        plt.imsave('adversarial_goldfish.pdf',adversarial_image[0],format='pdf') # save for printing
Let me know if this helps you!
