Hi everyone.
I am trying to create a CNN which, given an image as input, classifies which part of the image to focus on. For that purpose, I have collected gaze data from humans watching a given video and divided each video frame into 9 areas. With the actual gaze data acting as the supervisory signal, I am trying to make my system learn to mimic a human's eye gaze.
For starters, I am using a pre-built CNN for classifying the MNIST dataset in TensorFlow. I am currently trying to make my dataset follow the format of the MNIST dataset (keras.datasets.mnist). I have the video frames in .jpg format and the corresponding grid areas as a NumPy array.
I am stuck on how to correctly label and format my images so that I can feed them directly into the pre-built CNN. My system: TensorFlow 2.7.0, Python 3.9.7 using conda.
Any help is much appreciated.
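For reference, keras.datasets.mnist.load_data() returns uint8 image arrays of shape (N, 28, 28) and an integer label array of shape (N,). A minimal sketch of mimicking that layout, with synthetic stand-ins for the decoded .jpg frames (the 28x28 grayscale size and the number of frames are assumptions for illustration):

```python
import numpy as np

# Stand-ins for decoded .jpg frames (e.g., loaded via PIL and resized
# to grayscale 28x28 to match the MNIST-trained CNN's input).
frames = [np.zeros((28, 28), dtype=np.uint8) for _ in range(5)]

# One grid-cell label (0-8) per frame, taken from the recorded gaze data.
gaze_grid = np.array([0, 4, 8, 4, 2], dtype=np.uint8)

x_train = np.stack(frames)   # shape (5, 28, 28), like MNIST's x_train
y_train = gaze_grid          # shape (5,), labels 0-8 instead of digits 0-9
```

With this layout the 9 grid cells simply take the place of MNIST's 10 digit classes, so the only architectural change to the pre-built CNN is setting the final Dense layer to 9 units.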
I am trying to balance my image dataset using WeightedRandomSampler, but after loading the data with DataLoader I am unable to split the dataset into train and test sets. Could anyone please guide me in this regard?
You should split your Dataset (e.g., using torch.utils.data.random_split), not your DataLoader. The split should be agnostic to the way you sample/process your training data. Only after you have a training split of the data can you apply WeightedRandomSampler to it.
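A sketch of that order of operations on a toy TensorDataset (the tensor shapes, split sizes, and class imbalance are made up for illustration):

```python
import torch
from torch.utils.data import (DataLoader, TensorDataset,
                              WeightedRandomSampler, random_split)

# Toy imbalanced dataset: 80 samples of class 0, 20 of class 1.
images = torch.randn(100, 3, 8, 8)
labels = torch.cat([torch.zeros(80, dtype=torch.long),
                    torch.ones(20, dtype=torch.long)])
dataset = TensorDataset(images, labels)

# 1) Split the Dataset itself, not the DataLoader.
train_set, test_set = random_split(
    dataset, [80, 20], generator=torch.Generator().manual_seed(0))

# 2) Build the sampler from the *training* labels only.
train_labels = labels[train_set.indices]
class_counts = torch.bincount(train_labels)
sample_weights = 1.0 / class_counts[train_labels].float()
sampler = WeightedRandomSampler(sample_weights,
                                num_samples=len(train_set),
                                replacement=True)

# 3) The sampler goes on the training loader only;
#    the test loader stays unweighted.
train_loader = DataLoader(train_set, batch_size=16, sampler=sampler)
test_loader = DataLoader(test_set, batch_size=16)
```

Note that the test loader deliberately gets no sampler: rebalancing the evaluation data would distort your metrics.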
from tensorflow.keras.preprocessing.image import ImageDataGenerator
Can we also reshape images with ImageDataGenerator's flow_from_directory method?
E.g., we have color images in 10 classes in 10 folders, and we are providing the path of that directory, let's say train:
gen = ImageDataGenerator(rescale=1./255, width_shift_range=0.05, height_shift_range=0.05)
train_imgs = gen.flow_from_directory(
    '/content/data/train',
    target_size=(10, 10),
    batch_size=1,
    class_mode='categorical')
Now my model takes input of shape 300, and I want to define training data from this train_imgs, that is, from images of 10x10x3.
Is there any library, method, or option available to convert this data generator into a matrix in which each column is one image vector?
Generally the best option in these cases is to add a Reshape layer to the start of your model: layers.Reshape((300,), input_shape=(10, 10, 3)). You can also use layers.Reshape((-1,), input_shape=(10, 10, 3)), and it will automatically figure out the correct output length.
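If you do want an explicit matrix rather than a Reshape layer in the model, the same flattening can be applied in NumPy to each batch the generator yields. Here the generator output is simulated with a random array (the batch itself is a stand-in for what flow_from_directory returns):

```python
import numpy as np

# Stand-in for one batch from train_imgs: 4 RGB images of 10x10.
batch = np.random.rand(4, 10, 10, 3).astype("float32")

# Flatten each image to a 300-dim vector; here rows are images.
flat = batch.reshape(len(batch), -1)   # shape (4, 300)
```

If you need each *column* to be an image vector, as worded in the question, that is just the transpose, flat.T, with shape (300, 4).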
I have a dataset with 3 tensor outputs of data, label and path:
import tensorflow as tf  # tensorflow version 2.1
data = tf.constant([[0,1],[1,2],[2,3],[3,4],[4,5],[5,6],[6,7],[7,8],[8,9],[9,0]], name='data')
labels = tf.constant([0,1,0,1,0,1,0,1,0,1], name='label')
path = tf.constant(['p0','p1','p2','p3','p4','p5','p6','p7','p8','p9'], name='path')
my_dataset=tf.data.Dataset.from_tensor_slices((data,labels,path))
I want to separate my_dataset back into 3 datasets of data, labels, and paths (or 3 tensors) without iterating over it and without converting it to NumPy.
In TensorFlow 1.x this is done simply with
d, l, p = my_dataset.make_one_shot_iterator().get_next()
and then converting the tensors to datasets. How can I do this in TensorFlow 2?
Thanks!
The solution I found does not look very "pythonic", but it works.
I used the map() method:
data = my_dataset.map(lambda x, y, z: x)
labels = my_dataset.map(lambda x, y, z: y)
paths = my_dataset.map(lambda x, y, z: z)
After this separation, the order of the labels stays the same.
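To double-check that the order is preserved, the three mapped datasets can be zipped back together and compared element-wise with the original (using a smaller version of the toy tensors above):

```python
import tensorflow as tf

data = tf.constant([[0, 1], [1, 2], [2, 3]], name='data')
labels = tf.constant([0, 1, 0], name='label')
path = tf.constant(['p0', 'p1', 'p2'], name='path')
my_dataset = tf.data.Dataset.from_tensor_slices((data, labels, path))

d = my_dataset.map(lambda x, y, z: x)
l = my_dataset.map(lambda x, y, z: y)
p = my_dataset.map(lambda x, y, z: z)

# Zipping the three back together reproduces the original element order.
rezipped = tf.data.Dataset.zip((d, l, p))
```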
I currently have medical images from two sources: one in JPEG format and the other in TIF. TIF is lossless while JPEG is lossy, so if I convert TIF to JPEG there is a chance of data loss. Or can I mix both together and use them for training the CNN?
Using Keras with the TensorFlow backend.
Neural networks, and machine learning models in general, do not take specific file formats as input; they expect matrices/tensors of real numbers. For RGB images this means a tensor with dimensions (width, height, 3). When an image is read from a file, it is transformed automatically into a tensor, so it does not matter which kind of file format you use.
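A quick way to convince yourself, assuming Pillow is available: the same picture saved as lossless TIFF and as lossy JPEG decodes to arrays of identical shape and dtype, differing at most slightly in pixel values (the image size and color are arbitrary):

```python
import io

import numpy as np
from PIL import Image

img = Image.new('RGB', (16, 16), (120, 60, 200))

def roundtrip(fmt):
    # Save to an in-memory buffer in the given format, then decode back.
    buf = io.BytesIO()
    img.save(buf, format=fmt)
    buf.seek(0)
    return np.asarray(Image.open(buf).convert('RGB'))

tif_arr = roundtrip('TIFF')   # lossless
jpg_arr = roundtrip('JPEG')   # lossy
# Same tensor shape and dtype either way; only JPEG's pixel values
# may deviate slightly from the original.
```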