I have tried the code shown in the screenshot below, but I cannot get it to work. How can I create a DataLoader for a CycleGAN with this input data?
Related
Hello everyone.
I am trying to create a CNN that, given input images, classifies which part of the image to focus on. For that purpose, I collected human gaze data for a given video and divided each video frame into 9 areas. With the actual gaze data acting as the supervisory signal, I am trying to make my system learn to mimic a human's eye gaze.
For starters, I am using a pre-built CNN for classifying the MNIST dataset with TensorFlow. I am currently trying to make my dataset follow the format of the MNIST dataset in keras.datasets.mnist. I have video frames in .jpg format and the corresponding grid area as a NumPy array.
I am stuck on how to correctly label and format my images so that I can feed them directly into the pre-built CNN. System: TensorFlow 2.7.0 and Python 3.9.7, installed with conda.
Any help is much appreciated.
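One possible way to get the MNIST layout, sketched with NumPy only. keras.datasets.mnist.load_data() returns (x_train, y_train), (x_test, y_test), where the images are uint8 arrays and the labels are a 1-D integer array. The helper name to_mnist_format, the frame size, the 80/20 split and the zero-filled stand-in frames are all assumptions for illustration; in practice each frame would first be read from its .jpg file into an array.

```python
import numpy as np

def to_mnist_format(frames, grid_labels, num_classes=9, test_fraction=0.2, seed=0):
    """Hypothetical helper: pack frames and grid labels like mnist.load_data()."""
    # Stack equally sized frames into one array: (N, H, W) or (N, H, W, C).
    x = np.stack(frames).astype(np.uint8)
    y = np.asarray(grid_labels)
    if y.ndim != 1 or len(y) != len(x):
        raise ValueError("need exactly one integer label per frame")
    if y.min() < 0 or y.max() >= num_classes:
        raise ValueError("labels must lie in 0..num_classes-1")
    y = y.astype(np.uint8)

    # Shuffle once, then split into train and test parts.
    order = np.random.default_rng(seed).permutation(len(x))
    n_test = int(round(len(x) * test_fraction))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return (x[train_idx], y[train_idx]), (x[test_idx], y[test_idx])

# Stand-in data: 10 blank RGB frames and grid areas numbered 0..8.
frames = [np.zeros((90, 160, 3), dtype=np.uint8) for _ in range(10)]
grid_labels = [i % 9 for i in range(10)]

(x_train, y_train), (x_test, y_test) = to_mnist_format(frames, grid_labels)
print(x_train.shape, y_train.shape, x_test.shape, y_test.shape)
```

Note that MNIST images are 28x28 grayscale, so a network built for MNIST would probably also need its input shape changed, or the frames resized and converted, before these arrays fit.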
I would like to know how to build a dataset from multiple NumPy arrays, with each array used as a data channel.
I have multiple arrays that form 4-channel input data and 1-channel output data.
For instance,
Example of Data Structure - Tensor Form
I think this form of data can be built with the class below,
from torch.utils.data import Dataset
But I am still having trouble assigning the tensors properly and specifying which is the input and which is the output.
I would appreciate any ideas or guidance on this problem.
Thank you so much!
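A minimal sketch of one way this could look, assuming each channel is a NumPy array of shape (N, H, W) and the target is stored as (N, 1, H, W). The class name ArrayPairDataset, the array sizes and the random stand-in data are hypothetical, not taken from the question.

```python
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader

class ArrayPairDataset(Dataset):
    """Hypothetical dataset pairing 4-channel inputs with 1-channel targets."""

    def __init__(self, inputs, targets):
        if len(inputs) != len(targets):
            raise ValueError("inputs and targets need the same number of samples")
        # Convert once so __getitem__ only has to index.
        self.inputs = torch.as_tensor(inputs, dtype=torch.float32)
        self.targets = torch.as_tensor(targets, dtype=torch.float32)

    def __len__(self):
        return self.inputs.shape[0]

    def __getitem__(self, idx):
        # The first element is the model input, the second the expected output.
        return self.inputs[idx], self.targets[idx]

# Stand-in data: four separate channel arrays, each (N, H, W).
channels = [np.random.rand(10, 8, 8) for _ in range(4)]
inputs = np.stack(channels, axis=1)      # -> (10, 4, 8, 8)
targets = np.random.rand(10, 1, 8, 8)    # -> (10, 1, 8, 8)

dataset = ArrayPairDataset(inputs, targets)
loader = DataLoader(dataset, batch_size=5, shuffle=True)

x_batch, y_batch = next(iter(loader))
print(x_batch.shape, y_batch.shape)
```

Stacking on axis=1 puts the channels in the position that PyTorch convolution layers expect (batch, channels, height, width).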
I am new to Keras and am trying to do data augmentation, but I am stuck right at the start.
I have an image and I am trying to create augmented versions of it as follows:
datagen = image.ImageDataGenerator(rotation_range=20)
iter = datagen.flow(samples, batch_size=2)
batch = iter.next()
plt.imshow(batch[0].astype('uint8'))
I understand that datagen is a generator and iter is an iterator over it, but my doubt is about batch_size. Here batch_size=2, which I take to mean that each iteration creates a batch of 2 images. I can see the first image in the batch using batch[0] as shown above, but I cannot see the second image using batch[1]. When I check batch.shape, it displays (1, 399, 640, 3), which means there is only one image in the batch. I do not understand this. Where is the second image? How can I display the second image of the batch?
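A likely explanation, assuming samples holds a single image with shape (1, 399, 640, 3): a batch cannot contain more images than samples does, so batch_size=2 still yields one image per batch. The NumPy-only sketch below repeats the image so that the input array has two entries; the zero-filled array is a stand-in for the real image.

```python
import numpy as np

# Stand-in for the single loaded image, already expanded to a batch of one.
samples = np.zeros((1, 399, 640, 3), dtype=np.uint8)

# Repeat along the batch axis so the array holds two copies of the image.
repeated = np.repeat(samples, 2, axis=0)
print(repeated.shape)
```

Passing repeated instead of samples to datagen.flow(..., batch_size=2) should then give a batch of two differently augmented copies. Alternatively, keeping samples as it is and fetching from the iterator twice gives two separately augmented images, one per batch.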
I am working on a multi-label image classification problem. I have image data and some other features, such as gender, but the issue is that I will not get these extra features during testing. In other words, during testing only the image information will be provided.
My question is: how can I use these extra features to help my image model, which is a convolutional neural network, even though I won't have this information during testing?
Any advice will be helpful. Thanks in advance.
This is a really open-ended question. I can give you some general guidelines on how this can work.
The Keras model API supports multiple inputs as well as merge layers. For example, you can have something like this:
from keras.layers import Concatenate, Input
from keras.models import Model
image = Input(...)
text = Input(...)
... # apply layers onto image and text
combined = Concatenate()([image, text])
... # apply layers onto combined
model = Model([image, text], [combined])
This way you can have a model that takes multiple inputs and makes use of all of your data sources. Keras has tools to combine your different inputs to produce one output. The part where this becomes open-ended is the architecture.
Right now you should probably pass image through a CNN, and then merge the output with text. You have to tweak the exact specifications, such as how you handle each input, your merge method, and how you handle the combined output.
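A minimal runnable sketch of that idea, using tf.keras. The input sizes, layer widths, the count of 5 labels and the single extra feature are assumptions chosen only for illustration; the extra input plays the role of text in the outline above.

```python
import numpy as np
import tensorflow as tf

layers = tf.keras.layers

# Two inputs: the image, and one extra feature (hypothetically an encoded gender).
image_in = tf.keras.Input(shape=(64, 64, 3), name="image")
extra_in = tf.keras.Input(shape=(1,), name="extra")

# Small CNN branch that turns the image into a feature vector.
x = layers.Conv2D(16, 3, activation="relu")(image_in)
x = layers.MaxPooling2D()(x)
x = layers.Conv2D(32, 3, activation="relu")(x)
x = layers.GlobalAveragePooling2D()(x)

# Merge the image features with the extra feature, then classify.
combined = layers.Concatenate()([x, extra_in])
hidden = layers.Dense(32, activation="relu")(combined)
# Sigmoid outputs, one per label, as is common for multi-label problems.
out = layers.Dense(5, activation="sigmoid", name="labels")(hidden)

model = tf.keras.Model(inputs=[image_in, extra_in], outputs=out)
model.compile(optimizer="adam", loss="binary_crossentropy")

# Random stand-in data, only to check that the shapes line up.
images = np.random.rand(4, 64, 64, 3).astype("float32")
extras = np.random.randint(0, 2, size=(4, 1)).astype("float32")
preds = model.predict([images, extras], verbose=0)
print(preds.shape)
```

Note that a model built this way still expects the extra input at prediction time, so on its own it does not cover the case where only images are available during testing.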
A good example of merge being used is here, where a GAN is given latent noise in the form of an image but also a label to determine what kind of image it should generate. Both the discriminator and the generator make use of the multiply merge layer to combine their inputs.
I have built a convolutional neural network for MNIST data. Now I want to change the input to my own images. How can I do that? Do I need to save the pictures in a specific format? In addition, how do I save all the pictures and train on them one after the other? I am using TensorFlow with Python.
TensorFlow supports BMP, GIF, JPEG and PNG out of the box.
So load the data (read the file into memory as a 0-D tensor of type string), then pass it to tf.image.decode_image, or to one of the specialized functions if that doesn't work for some reason.
You should get back the image as a tensor of shape [height, width, channels] (channels is 1 for a single-channel image such as grayscale).
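A short sketch of those two steps in eager mode. It writes a small blank PNG to a temporary directory first so that it runs on its own; the file name and image size are made up for the demo.

```python
import os
import tempfile

import tensorflow as tf

# Create a demo image on disk: height 40, width 60, 3 channels.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "demo.png")
pixels = tf.zeros([40, 60, 3], dtype=tf.uint8)
tf.io.write_file(path, tf.io.encode_png(pixels))

# Step 1: read the whole file into a scalar (0-D) string tensor.
raw = tf.io.read_file(path)

# Step 2: decode the bytes into a uint8 image tensor.
image = tf.image.decode_image(raw, channels=3)

print(raw.dtype, raw.shape)   # string tensor with an empty shape
print(image.shape)            # height first, then width, then channels
```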
To make this work nicely you should have all the images in the same format. If you can load all the images into RAM and pass them in bulk, go for it, since it's probably the easiest thing to do. The next easiest thing would be to copy the images into tf.train.Example records and use tf.TFRecordReader to do the shuffling and batching. If all else fails, I think you can set up the input functions to read the images on demand and pipe them through the batching mechanism, but I'm not sure how I would do that.
Here's a link to the TensorFlow documentation related to images.
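For the last option, reading images on demand, one possible sketch uses the tf.data API from TensorFlow 2, which is a different mechanism from the TFRecord reader mentioned above. The demo files, labels, target size of 28x28 and batch size are assumptions made so the example is self-contained.

```python
import os
import tempfile

import tensorflow as tf

# Write four small blank PNG files to stand in for a real image folder.
tmp_dir = tempfile.mkdtemp()
paths = []
for i in range(4):
    p = os.path.join(tmp_dir, f"img_{i}.png")
    tf.io.write_file(p, tf.io.encode_png(tf.zeros([40, 60, 3], dtype=tf.uint8)))
    paths.append(p)
labels = [0, 1, 0, 1]  # made-up class labels

def load_image(path, label):
    # Runs lazily for each element, so files are only read when needed.
    raw = tf.io.read_file(path)
    image = tf.io.decode_png(raw, channels=3)
    # Resize to one common size and scale pixel values to [0, 1].
    image = tf.image.resize(image, [28, 28]) / 255.0
    return image, label

dataset = (
    tf.data.Dataset.from_tensor_slices((paths, labels))
    .shuffle(buffer_size=len(paths))
    .map(load_image)
    .batch(2)
)

images, batch_labels = next(iter(dataset))
print(images.shape, batch_labels.shape)
```

The specialized decode_png is used here instead of decode_image because it gives the decoded tensor a known rank inside map, which the resize step needs. A dataset like this can usually be passed straight to model.fit.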