Train your own image with tensorflow? - python-3.x

I have one image ( i don't have dataset ) I want to train a model in tensorflow,
such that I can use that model to recognize the image fast.
I have implemented one such thing, but it doesn't work:
import tensorflow as tf
filenames = ['pic.jpg']
# step 2
filename_queue = tf.train.string_input_producer(filenames)
# step 3: read, decode and resize images
reader = tf.WholeFileReader()
filename, content = reader.read(filename_queue)
image = tf.image.decode_jpeg(content, channels=3)
image = tf.cast(image, tf.float32)
resized_image = tf.image.resize_images(image, [224, 224])
# step 4: Batching
image_batch = tf.train.batch([resized_image], batch_size=8)
Also, how vuforia is able to recognize with only one image so fast?. I want a similar implementation in tensorflow

This is not how machine learning and deep learning works. You can't just grab one element and build a model which explains this one element. If you will check a few NN tutorials, you will see that in order to train a reasonable model people use thousands or even millions of data points.

Related

Load several Images without label in keras cnn

I have several .jpeg images with different names, that I want to load into a cnn in a jupyter notebook to have them classified. The only way I found was:
test_image = image.load_img("name_of_picute.jpeg",target_size=(64,64))
test_image = image.img_to_array(test_image)
test_image = np.expand_dims(test_image, axis=0)
result = cnn.predict(test_image)
All the other things found at the Keras API like tf.keras.preprocessing.image_dataset_from_directory()seems to only work on labeled data. Sadly I can't "simply" iterate over the name of the pictures a they are named differently, is there a way to predict all of them at once without naming every single picture?
Thanks for yout help,
Nick
The solutiontf.keras.preprocessing.image_dataset_from_directory can be updated to return the dataset and the image_path as explained here -> https://stackoverflow.com/a/63725072/4994352
There are multiple ways, for larger data it is useful to use a tf.data.DataSet as it can be tweaked for performance quite easily. I will give you the non-performance-optimized code. Replace <YOUR PATH INCL. REGEX> with the path like ../input/pokemon-images-and-types/images/*/*.
import tensorflow as tf
from tensorflow.data.experimental import AUTOTUNE
def load(file_path):
img = tf.io.read_file(file_path)
img = tf.image.decode_jpeg(img, channels=3)
... # do some preprocessing like resizing if necessary
return img
list_ds = tf.data.Dataset.list_files(str('<YOUR PATH INCL. REGEX>'), shuffle=True) # Get all images from subfolders
train_dataset = list_ds.take(-1)
# Set `num_parallel_calls` so multiple images are loaded/processed in parallel.
train_dataset = train_dataset.map(load, num_parallel_calls=AUTOTUNE)

K-fold crossvalidation on images

Let's say I have some pictures divided in 3 categories ("cat", "dog", "mouse") and my DL net is written in keras.
The design I used is the same as in this picture (1):
I splitted the data into three different folders: training, validation and test.
The net should be able to recognize a cat, dog or a mouse given a picture. The accuracy I get is around 98%.
It works.
But I need for some reasons change that design. I would like to use the K-fold cross-validation process and the schema should now look like (2):
Now my problem is that I don't know how to split and distribute the original data according to the schema in Fig. 2.
I can only imagine 2 different ways. Let's forget the test directory for the moment:
I create 2 folders: "Training" and "Validation". In both is the structure the same as in Fig. 1: Three subdirectory for every categories. Now the problem is: should I move the data around when progressing from Fold 1 to Fold 3? Or I can allocate once the images into the subdirectories?
I create 2 folders: "Training" and "Validation", BUT I mix all images togheter. No subdirectory. In this case I have the problem that I lose the connection between the picture name and the pet on it. How can I tell Keras, which animal should be identified?
Personally I would mix all images togheter, no matter what they show. But I would save the information of the content into a file. In this case I pass to Keras the directory (Validation or Training) and a file containing the name of all files and their content.
What would you suggest?
Ok, I can answer my own question.
The easiest way is just to user Kfold form sklearn in the python script
from sklearn.model_selection import KFold
After that you need to istantiate KFold
kfold = KFold(n_splits = 4, shuffle = True)
and you iterate over the splitted dataset like:
datagen = ImageDataGenerator(rescale = 1. / 255.)
for train, test in kfold.split(df_data):
# df is the whole dataset (all together!)
df_train = df.iloc[train, :] # Look that train is coming from the for in .. loop
df_test = df.iloc[test, :] # The same for test
train_generator = datagen.flow_from_dataframe(dataframe = df_train,
directory = dataset_dir,
... )
test_generator = datagen.flow_from_dataframe(dataframe = df_test,
directory = dataset_dir,
...)
model = models.Sequential()
.....
model.compile(...)
model.fit(...)
and it is done! The dataset is now splitted in partitions!!!
Note that the class ImageDataGenerator is not in the for loop!!!
And note please, that the methods (model creation, compile() and fit()) must be in the for-loop.
The code above works for me very good.

Multiprocessing ResNet Feature Extraction with Image URLs

I have a simple function to take an image url and extract features from it using resnet in Keras then hand it off to an Xgboost model loaded from a pkl file.
def classify(img, resnet_model, loaded_model):
try:
images = io.imread(img.strip())
images = Image.fromarray(images)
test_image = img.resize((224,224))
test_image = image.img_to_array(test_image)
test_image = expand_dims(test_image, axis = 0)
img_data = preprocess_input(test_image)
image_features = resnet_model.predict(img_data)
image_features_array = array(image_features)
predicted_image = loaded_model.predict(xgb.DMatrix(DataFrame(image_features_array)))
except:
predicted_image = 'Broken URL'
return predicted_image`
Currently I am just looping through a list of img urls and it works fine but I will need it to perform much faster. The code itself may not be the most efficient yet but I am mostly concerned with multiprocessing. My attempts either hang or immediately result in an empty list.
There is a similar question posted years ago here: question but the answers were not very satisfying and involved holding a batch of image files locally. I would prefer to just have one worker making the request for the image and then predicting that image.

Capture output of ImageDataGenerator without saving images to drive?

I'm using image augmentation to train my model, applying transforms like brightness and color shifts. I like to preview the effects of the augmentations before putting them into use. Normally, I do it like this:
from keras_preprocessing.image import ImageDataGenerator
datagen = ImageDataGenerator(horizontal_flip = True,
fill_mode = "nearest",
zoom_range = 0.3,
rotation_range=360)
i = 0
for batch in datagen.flow_from_directory(directory='./my_images/,
batch_size = 1,
save_to_dir='.',
save_prefix='aug',
save_format='jpeg'):
i += 1
if i > 2:
break # Yields two images
Then I open the images and look at them, or read them in and print them to my notebook. But I don't like this- it's clunky. Is there a way to directly capture the altered images produced by my generator? I'd like to add them to an array.
Thanks!

Get gradient value necessary to break an image

I've been experimenting with adversarial images and I read up on the fast gradient sign method from the following link https://arxiv.org/pdf/1412.6572.pdf...
The instructions explain that the necessary gradient can be calculated using backpropagation...
I've been successful at generating adversarial images but I have failed at attempting to extract the gradient necessary to create an adversarial image. I will demonstrate what I mean.
Let us assume that I have already trained my algorithm using logistic regression. I restore the model and I extract the number I wish to change into a adversarial image. In this case it is the number 2...
# construct model
logits = tf.matmul(x, W) + b
pred = tf.nn.softmax(logits)
...
...
# assign the images of number 2 to the variable
sess.run(tf.assign(x, labels_of_2))
# setup softmax
sess.run(pred)
# placeholder for target label
fake_label = tf.placeholder(tf.int32, shape=[1])
# setup the fake loss
fake_loss = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits,labels=fake_label)
# minimize fake loss using gradient descent,
# calculating the derivatives of the weight of the fake image will give the direction of weights necessary to change the prediction
adversarial_step = tf.train.GradientDescentOptimizer(learning_rate=FLAGS.learning_rate).minimize(fake_loss, var_list=[x])
# continue calculating the derivative until the prediction changes for all 10 images
for i in range(FLAGS.training_epochs):
# fake label tells the training algorithm to use the weights calculated for number 6
sess.run(adversarial_step, feed_dict={fake_label:np.array([6])})
sess.run(pred)
This is my approach, and it works perfectly. It takes my image of number 2 and changes it only slightly so that when I run the following...
x_in = np.expand_dims(x[0], axis=0)
classification = sess.run(tf.argmax(pred, 1))
print(classification)
it will predict the number 2 as a number 6.
The issue is, I need to extract the gradient necessary to trick the neural network into thinking number 2 is 6. I need to use this gradient to create the nematode mentioned above.
I am not sure how can I extract the gradient value. I tried looking at tf.gradients but I was unable to figure out how to produce an adversarial image using this function. I implemented the following after the fake_loss variable above...
tf.gradients(fake_loss, x)
for i in range(FLAGS.training_epochs):
# calculate gradient with weight of number 6
gradient_value = sess.run(gradients, feed_dict={fake_label:np.array([6])})
# update the image of number 2
gradient_update = x+0.007*gradient_value[0]
sess.run(tf.assign(x, gradient_update))
sess.run(pred)
Unfortunately the prediction did not change in the way I wanted, and moreover this logic resulted in a rather blurry image.
I would appreciate an explanation as to what I need to do in order calculate and extract the gradient that will trick the neural network, so that if I were to take this gradient and apply it to my image as a nematode, it will result in a different prediction.
Why not let the Tensorflow optimizer add the gradients to your image? You can still evaluate the nematode to get the resulting gradients that were added.
I created a bit of sample code to demonstrate this with a panda image. It uses the VGG16 neural network to transform your own panda image into a "goldfish" image. Every 100 iterations it saves the image as PDF so you can print it losslessly to check if your image is still a goldfish.
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
import IPython.display as ipyd
from libs import vgg16 # Download here! https://github.com/pkmital/CADL/tree/master/session-4/libs
pandaimage = plt.imread('panda.jpg')
pandaimage = vgg16.preprocess(pandaimage)
plt.imshow(pandaimage)
img_4d = np.array([pandaimage])
g = tf.get_default_graph()
input_placeholder = tf.Variable(img_4d,trainable=False)
to_add_image = tf.Variable(tf.random_normal([224,224,3], mean=0.0, stddev=0.1, dtype=tf.float32))
combined_images_not_clamped = input_placeholder+to_add_image
filledmax = tf.fill(tf.shape(combined_images_not_clamped), 1.0)
filledmin = tf.fill(tf.shape(combined_images_not_clamped), 0.0)
greater_than_one = tf.greater(combined_images_not_clamped, filledmax)
combined_images_with_max = tf.where(greater_than_one, filledmax, combined_images_not_clamped)
lower_than_zero =tf.less(combined_images_with_max, filledmin)
combined_images = tf.where(lower_than_zero, filledmin, combined_images_with_max)
net = vgg16.get_vgg_model()
tf.import_graph_def(net['graph_def'], name='vgg')
names = [op.name for op in g.get_operations()]
style_layer = 'prob:0'
the_prediction = tf.import_graph_def(
net['graph_def'],
name='vgg',
input_map={'images:0': combined_images},return_elements=[style_layer])
goldfish_expected_np = np.zeros(1000)
goldfish_expected_np[1]=1.0
goldfish_expected_tf = tf.Variable(goldfish_expected_np,dtype=tf.float32,trainable=False)
loss = tf.reduce_sum(tf.square(the_prediction[0]-goldfish_expected_tf))
optimizer = tf.train.AdamOptimizer().minimize(loss)
sess = tf.InteractiveSession()
sess.run(tf.global_variables_initializer())
def show_many_images(*images):
fig = plt.figure()
for i in range(len(images)):
print(images[i].shape)
subplot_number = 100+10*len(images)+(i+1)
plt.subplot(subplot_number)
plt.imshow(images[i])
plt.show()
for i in range(1000):
_, loss_val = sess.run([optimizer,loss])
if i%100==1:
print("Loss at iteration %d: %f" % (i,loss_val))
_, loss_val,adversarial_image,pred,nematode = sess.run([optimizer,loss,combined_images,the_prediction,to_add_image])
res = np.squeeze(pred)
average = np.mean(res, 0)
res = res / np.sum(average)
plt.imshow(adversarial_image[0])
plt.show()
print([(res[idx], net['labels'][idx]) for idx in res.argsort()[-5:][::-1]])
show_many_images(img_4d[0],nematode,adversarial_image[0])
plt.imsave('adversarial_goldfish.pdf',adversarial_image[0],format='pdf') # save for printing
Let me know if this helps you!

Resources