I've noticed that for any tutorial or example of a Keras CNN that I've seen, the input images are numbered, e.g.:
dog0001.jpg
dog0002.jpg
dog0003.jpg
...
Is this necessary?
I'm working with an image dataset with fairly random filenames (the classes come from the directory name), e.g.:
picture_A2.jpg
image41110.jpg
cellofinterest9A.jpg
I actually want to keep the filenames because they mean something to me, but do I need to append sequential numbers to my image files?
No, the files can have any names; it only depends on how you load your data. In your case, you can use flow_from_directory (a method of ImageDataGenerator) to generate the training data, and each subdirectory name will be used as the class of the images it contains.
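As a rough illustration of why the filenames do not matter, here is a minimal standard-library sketch of directory-based labeling. It is not the Keras implementation; the function name index_by_directory and the demo folder names ("cells", "other") are made up for the example. It only mimics the idea that the label comes from the subdirectory, with class names sorted and numbered from 0.

```python
import os
import tempfile

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def index_by_directory(root):
    """Return (class_indices, samples); each label comes from the subdirectory name."""
    class_names = sorted(
        d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d))
    )
    class_indices = {name: i for i, name in enumerate(class_names)}
    samples = []
    for name in class_names:
        class_dir = os.path.join(root, name)
        for file_name in sorted(os.listdir(class_dir)):
            # The filename is only checked for its extension, never parsed for a number.
            if file_name.lower().endswith(IMAGE_EXTENSIONS):
                samples.append((os.path.join(class_dir, file_name), class_indices[name]))
    return class_indices, samples


# Hypothetical demo tree with arbitrary filenames (empty placeholder files).
demo_root = tempfile.mkdtemp()
demo_files = [
    ("cells", "cellofinterest9A.jpg"),
    ("cells", "picture_A2.jpg"),
    ("other", "image41110.jpg"),
]
for class_name, file_name in demo_files:
    os.makedirs(os.path.join(demo_root, class_name), exist_ok=True)
    open(os.path.join(demo_root, class_name, file_name), "wb").close()

class_indices, samples = index_by_directory(demo_root)
for path, label in samples:
    print(label, os.path.basename(path))
```

With Keras itself, the corresponding call would look something like ImageDataGenerator(rescale=1/255).flow_from_directory("data/train", target_size=(224, 224), class_mode="categorical"), where "data/train" is a placeholder path and the other arguments are example values.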
Related
I have a dataset that is already labeled with specific class names, and it is saved on my computer as:
Train Dataset :
-5_1
-5_2
-5_3
etc...
Where the subfolders (5_1, 5_2, etc.) are the classes of the images. I want to use semi-supervised training, where both labeled and unlabeled images must be used. But I don’t know how to “erase” the classes from my dataset in order to make the images unlabeled and load them into my CNN.
For the labeled images I use datasets.ImageFolder() and DataLoader() so I can load them for training.
Thanks for the help!
PS1: I thought of saving them in a separate folder named “Unlabeled”, but I am sure the folder name would be treated as a new class, which would ruin the predictions in training as well as in testing.
PS2: At the moment I can't use an existing dataset such as CIFAR or MNIST that already comes with unlabeled data.
I tried to create my own dataset as a new class, but I got confused at the point where I have to remove the class labels.
I am using the COBRE brain MRI dataset, which contains NIfTI files. I can visualize them, but I could not understand how to feed them to a deep learning model in the correct format. I read the Nilearn documentation, but it uses only a single .nii file for one subject as an example. The question is: how do I give 100 .nii files to a CNN?
The second thing is how to determine which slice of the file should be used. Should it be the middle one? Each subject's NIfTI file consists of 150 slices of the brain.
The third thing is how to provide the model with labels. The dataset doesn't contain any masks. How do I give the model a specific label for a specific file? Should I create a CSV file with the paths of the .nii files and their associated labels?
Please explain, or suggest some resources on this.
Hi, I recently got into processing .nii files for one of my projects. I have made a breakthrough up to a certain level (preprocessing), but not yet at the model level.
For your second question: usually an expert visualizes the .nii files and provides the location(s) of the ROI (region of interest).
I am currently in the process of parsing the .nii files into CSV format with labels. So the answer to your third question is: we label the coordinates (x, y, z, c, t) according to the ROI locations. (I may need to correct this understanding as I advance, but for now this is the approach I am going to follow to feed the dataset to the model.)
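The CSV idea from the question (one row per .nii file with its label) can be sketched as below. This is only an illustration under assumptions: the column names path and label, the file paths and the label values are all hypothetical, and the manifest is held in a string so the example runs without any files on disk.

```python
import csv
import io

# Hypothetical manifest: one row per subject file, with a class label.
MANIFEST_TEXT = """path,label
scans/subject_001.nii,0
scans/subject_002.nii,1
scans/subject_003.nii,0
"""


def read_manifest(handle):
    """Parse a CSV with 'path' and 'label' columns into (path, int_label) pairs."""
    reader = csv.DictReader(handle)
    return [(row["path"], int(row["label"])) for row in reader]


pairs = read_manifest(io.StringIO(MANIFEST_TEXT))
for path, label in pairs:
    print(label, path)
```

In a real pipeline the manifest would be read with open("labels.csv", newline=""), and each path would then be passed to a NIfTI reader (for example nibabel's nib.load(path).get_fdata(), assuming nibabel is installed) before the volume or a chosen slice is handed to the model.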
FitBERT is a useful package, but I have a small doubt about using a custom BERT model for masked word prediction, as follows: I trained a BERT model on a custom corpus using Google's scripts (create_pretraining_data.py, run_pretraining.py, extract_features.py, etc.). As a result I got a vocab file, a .tfrecord file, a .json file and checkpoint files.
Now, how do I use those files with your package to predict a masked word in a given sentence?
From the tensorflow documentation:
A TFRecord file stores your data as a sequence of binary strings. This means you need to specify the structure of your data before you write it to the file. Tensorflow provides two components for this purpose: tf.train.Example and tf.train.SequenceExample. You have to store each sample of your data in one of these structures, then serialize it and use a tf.python_io.TFRecordWriter to write it to disk.
This document along with the tensorflow documentation explain quite well how to use those file types.
To use FitBERT directly through the library instead, you can follow the examples on the project's GitHub.
I have a set of images like this
And I'm trying to train a TensorFlow model in Python to read the numbers in the images.
I'm new to machine learning, and in my research I found a solution to a similar problem that uses CTC to train on and predict variable-length data in an image.
I'm trying to figure out whether I should use CTC or find a way to create a new image for every digit in the images that I already have.
For example, if the number in my image is 213, I would create 3 new images to train the model, showing the digits 2, 1 and 3 respectively, and use those digits as labels. I'm looking for tutorials or even TensorFlow documentation that can help me with that.
In the case of text, CTC absolutely makes sense: you don't want to split a text (like "213") into "2", "1", "3" manually, because it is often difficult to segment the text into individual characters.
CTC, on the other hand, just needs images and the corresponding ground-truth texts as input for training. You don't have to manually take care of things like alignment of chars, width of chars, number of chars. CTC handles that for you.
I don't want to repeat myself here, so I just point you to the tutorials I've written about text recognition and to the source code:
Build a Handwritten Text Recognition System using TensorFlow
SimpleHTR: a TensorFlow model for text-recognition
You can use the SimpleHTR model as a starting point. To get good results, you will have to generate training data (e.g. write a rendering tool that renders realistic-looking examples) and train the model from scratch with that data (more details on training can be found in the README).
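To make the "CTC handles alignment for you" point concrete, here is a small sketch of the CTC collapsing rule: merge repeated labels, then drop the blank symbol. It is an illustration only, not code from SimpleHTR; the function name ctc_collapse and the use of "-" as the blank are choices made for this example. It covers just the decoding-side rule; the training loss itself (for instance tf.nn.ctc_loss in TensorFlow) sums over all frame-level sequences that collapse to the ground-truth text.

```python
BLANK = "-"  # stand-in for the CTC blank label in this example


def ctc_collapse(frame_labels, blank=BLANK):
    """Collapse a per-frame label sequence: merge repeats, then remove blanks."""
    decoded = []
    previous = None
    for label in frame_labels:
        # Keep a label only when it differs from the previous frame and is not blank.
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return "".join(decoded)


# Different frame-level alignments of the same image text map to the same output,
# which is why no manual per-character segmentation is needed.
print(ctc_collapse("22-1--33"))
print(ctc_collapse("-2-11-3-"))
# A blank between identical labels keeps them as two separate characters.
print(ctc_collapse("1-1"))
```

The first two calls both yield "213" and the last one yields "11", showing that character width and position are absorbed by the collapsing rule rather than by the labels.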
In the article on creating a dataset for the TF object detection API [link], users are asked to store an object mask as:
a repeated list of single-channel encoded PNG strings, or a single dense 3D binary tensor where masks corresponding to each object are stacked along the first dimension
Since the article strongly suggests using a repeated list of single-channel encoded PNG strings, I would be particularly interested in knowing how to encode this. My annotations typically come from CSV files, from which I have no problem generating the TFRecords file. Are there any instructions somewhere on how to make this conversion?
I got it working with the pet dataset. TensorFlow gives you two ways: the COCO dataset TFRecord script and pet_tfrecord.
The first takes a JSON file.
The second takes XML and PNG files.
There is an application, VGG, that can produce annotations in PNG or in JSON; you then arrange the files in the required directory tree (I used the pet dataset example). In the end, though, the mask is not displayed, even with the example dataset...
Rather than the array of png, I ended up using a dense tensor, where each pixel value represents a class.
Note: I’ve seen many other people who didn’t use sequential numbers and ended up having problems later. The idea makes sense: if I have 2 classes, label one as 0 and the other as 255. The rationale is that when you view the mask in grayscale, it is obvious what gets labeled 0 or 255. This is impossible when you use 0, 1, 2, ... However, it violates a lot of assumptions in downstream code (e.g. DeepLab).
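A small NumPy sketch of the two points above: remapping a "visual" mask (0 and 255) to sequential class ids, and turning a dense class-id mask into stacked binary masks. The function names and the 2x2 example mask are made up for illustration. As a simplification, the masks here are stacked per class, whereas the API description quoted earlier stacks one mask per object.

```python
import numpy as np


def remap_to_sequential(mask, values):
    """Map arbitrary pixel values (e.g. 0 and 255) to class ids 0, 1, 2, ...

    Pixels whose value is not listed in `values` are left as class 0.
    """
    remapped = np.zeros(mask.shape, dtype=np.uint8)
    for class_id, value in enumerate(values):
        remapped[mask == value] = class_id
    return remapped


def to_stacked_binary_masks(class_mask, num_classes):
    """Return an array of shape (num_classes, H, W) with one binary mask per class."""
    return np.stack(
        [(class_mask == class_id).astype(np.uint8) for class_id in range(num_classes)]
    )


# Hypothetical 2x2 mask that is easy to inspect in grayscale (0 and 255).
visual_mask = np.array([[0, 255], [255, 0]], dtype=np.uint8)
class_mask = remap_to_sequential(visual_mask, values=[0, 255])
stacked = to_stacked_binary_masks(class_mask, num_classes=2)

print(class_mask)
print(stacked.shape)
```

Keeping the 0/255 version only for visual inspection and feeding the remapped 0, 1, 2, ... version to training code is one way to get both benefits.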