Convert unknown labels to Yolov5 - pytorch

I own a dataset of images with unknown label format, which is:
angry_actor_104.jpg 0 28 113 226 141 22.9362 0
It indicates an image as follows:
image_name face_id_in_image face_box_top face_box_left face_box_right face_box_bottom face_box_confidence expression_label
My question is: How can this be converted into the yolov5 format?
I have been looking this up for a long time and hope someone can help.
Thank you very much in advance.

Since the format is unknown, you are unlikely to find existing code that handles the transformation completely, but I can share some tips to get you started.
The annotations file does not have enough information on its own to be converted to YOLO format, because the conversion also needs the dimensions of each image. If all of your images have the same dimensions it is easier, but if they differ you will need additional code to extract the dimensions of each image. I will explain why below.
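Reading the dimensions per image is a one-liner with Pillow (assuming it is installed; OpenCV works just as well):
from PIL import Image

# width and height in pixels; cv2.imread(path).shape[:2] gives (height, width) instead
image_width, image_height = Image.open("angry_actor_104.jpg").size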
When you are done you will need to get the images and labels into a specific directory structure like this, with one txt file per image:
/images/actor1.jpg
/images/actor2.jpg
/labels/actor1.txt
/labels/actor2.txt
This is the shape that you want to get the annotation files into.
face_id_in_image x_center_image y_center_image width height
There is a clear description of what the values mean here https://stackoverflow.com/a/66563144/5183735.
Now you need to do some math to calculate the values.
width = (face_box_right - face_box_left)/image_width
height = (face_box_bottom - face_box_top)/image_height
x_center_image = face_box_left/image_width + (width/2)
y_center_image = face_box_top/image_height + (height/2)
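Putting it together, a rough sketch of the conversion (the annotation file name and folder layout are assumptions; adjust them to your data):
import os
from PIL import Image

ann_file = "annotations.txt"   # hypothetical: one line per face, in the format shown above
img_dir = "images"
lbl_dir = "labels"
os.makedirs(lbl_dir, exist_ok=True)

with open(ann_file) as f:
    for line in f:
        parts = line.split()
        if not parts:
            continue
        name, face_id, top, left, right, bottom, conf, expr = parts
        top, left, right, bottom = map(float, (top, left, right, bottom))

        # Normalising the box requires the image dimensions
        image_width, image_height = Image.open(os.path.join(img_dir, name)).size

        width = (right - left) / image_width
        height = (bottom - top) / image_height
        x_center_image = left / image_width + width / 2
        y_center_image = top / image_height + height / 2

        # One txt file per image, one line per face; the first column follows the
        # format above (YOLO normally expects a class id in that position)
        out_path = os.path.join(lbl_dir, os.path.splitext(name)[0] + ".txt")
        with open(out_path, "a") as out:
            out.write(f"{face_id} {x_center_image:.6f} {y_center_image:.6f} {width:.6f} {height:.6f}\n")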
I have some bits of code that may help you with reading the text file and saving the text files here.
https://github.com/pylabel-project/pylabel/blob/main/pylabel/exporter.py and https://github.com/pylabel-project/pylabel/blob/main/pylabel/importer.py.
If you are able to share your exact files I may be able to identify some shortcut to transform them.

Related

ValueError: setting an array element with a sequence when calling numpy.save()

I have a list called training_data that I'd like to store in an .npy file.
Each element of the list contains a 480x270 image matrix (screen) and a 1x4 output list, so an element looks like this:
[screen,output]
Essentially, I'm storing an image and the action taken (the key pressed, out of 4 available options) at the instant the image was captured from the screen, to train a CNN.
While in list form, training_data stores all my records without any issues, so this works:
training_data.append([screen,output])
But, when I try to save the list as a numpy array, into a .npy file, like so:
np.save(file_name,training_data)
I get the following error:
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 2 dimensions. The detected shape was (1000, 2) + inhomogeneous part.
I'm following a tutorial to create this CNN project. Admittedly, the tutorial was made a few years back (2017). Back then, the save operation worked flawlessly:
Tutorial Timestamp: 17:49
Any ideas as to why this error occurs will be greatly appreciated.
Thank you.
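For context: recent NumPy versions refuse to build an array from ragged nested sequences (here a 480x270 matrix next to a length-4 list) unless dtype=object is given explicitly, which is why code from a 2017 tutorial no longer saves. A minimal sketch of the workaround, with stand-in data:
import numpy as np

screen = np.zeros((480, 270))            # stand-in for a captured frame
output = [0, 1, 0, 0]                    # stand-in for the one-hot key press
training_data = [[screen, output] for _ in range(1000)]

# Building the object array explicitly avoids the "inhomogeneous shape" error
arr = np.array(training_data, dtype=object)    # shape (1000, 2), elements kept as objects
np.save("training_data.npy", arr)              # np.save pickles object arrays by default

# Loading an object array back requires allow_pickle=True
restored = np.load("training_data.npy", allow_pickle=True)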

How to make tesseract work for different kinds of images?

I am working on a digit recognition task using Tesseract and OpenCV. I got it working, but the solution is specific to a particular image: if I change the image I no longer get correct results, because I have to change the threshold value to suit each image. The steps I took were:
Cropping the image to find an appropriate region
Converting the image to grayscale
Applying a Gaussian blur
Taking an appropriate threshold
Passing the image through Tesseract
So, my question is: how can I make my code generic, i.e. usable for different images without updating the code?
While working on this image I processed it as:
imggray=cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
imgBlur=cv2.GaussianBlur(imggray,(5,5), 0)
imgDil=cv2.dilate(imgBlur,np.ones((5,5),np.uint8),iterations=1)
imgEro=cv2.erode(imgDil,np.ones((5,5),np.uint8),iterations=2)
ret,imgthresh=cv2.threshold(imgEro,28,255, cv2.THRESH_BINARY )
And for this image as:
imggray=cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
imgBlur=cv2.GaussianBlur(imggray,(5,5), 0)
imgDil=cv2.dilate(imgBlur,np.ones((5,5),np.uint8),iterations=0)
imgEro=cv2.erode(imgDil,np.ones((5,5),np.uint8),iterations=0)
ret,imgthresh=cv2.threshold(imgEro,37,255, cv2.THRESH_BINARY )
I had to change the number of iterations and the threshold value to obtain proper results. What can I do so that I do not have to change these values for each image?
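One common way to avoid hand-picking the threshold (a suggestion, not part of the asker's code) is to let OpenCV choose it: Otsu's method derives a global threshold from the image histogram, and adaptive thresholding computes one per neighbourhood, which helps with uneven lighting:
import cv2

img = cv2.imread("digits.jpg")                       # hypothetical input image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (5, 5), 0)

# Otsu picks the global threshold automatically (the 0 passed here is ignored)
_, th_otsu = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Adaptive thresholding recomputes the threshold over each 11x11 block
th_adapt = cv2.adaptiveThreshold(blur, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                 cv2.THRESH_BINARY, 11, 2)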

Is there a way to ignore EXIF orientation data when loading an image with PIL?

I'm getting some unwanted rotation when loading images using PIL. I'm loading image samples and their binary mask, so this is causing issues. I'm attempting to convert the code to use openCV instead, but this is proving sticky. I haven't seen any arguments in the documentation under Image.load(), but I'm hoping there's a workaround I just haven't found...
There is, but I haven't written it all up. Basically, if you load an image that has the EXIF "Orientation" field set, you can read that parameter.
First, a quick test using this image from the Pillow source tree, Pillow-7.1.2/Tests/images/hopper_orientation_6.jpg: running jhead on it shows the EXIF orientation is 6:
jhead /Users/mark/StackOverflow/PillowBuild/Pillow-7.1.2/Tests/images/hopper_orientation_6.jpg
File name : /Users/mark/StackOverflow/PillowBuild/Pillow-7.1.2/Tests/images/hopper_orientation_6.jpg
File size : 4951 bytes
File date : 2020:04:24 14:00:09
Resolution : 128 x 128
Orientation : rotate 90 <--- see here
JPEG Quality : 75
Now do that in PIL:
from PIL import Image
# Load that image
im = Image.open('/Users/mark/StackOverflow/PillowBuild/Pillow-7.1.2/Tests/images/hopper_orientation_6.jpg')
# Get all EXIF data
e = im.getexif()
# Specifically get orientation
e.get(0x0112)
# prints 6
Now click on the source and you can work out how your image has been rotated and undo it.
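For reference, the usual mapping from the Orientation value to the transpose that undoes it (essentially what PIL.ImageOps.exif_transpose applies, sketched here by hand; on Pillow 9.1+ the constants live under Image.Transpose):
from PIL import Image

ORIENTATION_TRANSPOSE = {
    2: Image.FLIP_LEFT_RIGHT,
    3: Image.ROTATE_180,
    4: Image.FLIP_TOP_BOTTOM,
    5: Image.TRANSPOSE,
    6: Image.ROTATE_270,   # our example: the stored pixels need a 90° clockwise turn
    7: Image.TRANSVERSE,
    8: Image.ROTATE_90,
}

im = Image.open('hopper_orientation_6.jpg')
orientation = im.getexif().get(0x0112, 1)
if orientation in ORIENTATION_TRANSPOSE:
    im = im.transpose(ORIENTATION_TRANSPOSE[orientation])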
Or, you could be completely unprofessional ;-) and create a function called SneakilyRemoveOrientationWhileNooneIsLooking(filename) and shell out (subprocess) to exiftool and remove the orientation with:
exiftool -Orientation= image.jpg
The author's "much simpler solution" described in the comment above is misleading, so I just want to clear that up.
Pillow does not automatically apply EXIF orientation transformation when reading an image. However, it has a method to do so: PIL.ImageOps.exif_transpose(image)
OpenCV automatically applies EXIF orientation when reading an image. You can disable this behavior by using the IMREAD_IGNORE_ORIENTATION flag.
I believe the author's true intention was to apply the EXIF orientation rather than ignore it, which is exactly what his solution accomplished.
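Both behaviours in a short sketch (the file name is a placeholder):
from PIL import Image, ImageOps
import cv2

# Pillow: orientation is NOT applied on open; apply it explicitly
im = ImageOps.exif_transpose(Image.open("sample.jpg"))

# OpenCV: orientation IS applied by default; this flag turns that off
raw = cv2.imread("sample.jpg", cv2.IMREAD_COLOR | cv2.IMREAD_IGNORE_ORIENTATION)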

Fastbert: BertDataBunch error for multilabel text classification

I'm following the FastBert tutorial from huggingface https://medium.com/huggingface/introducing-fastbert-a-simple-deep-learning-library-for-bert-models-89ff763ad384
The problem is that the code is not exactly reproducible. The main issue I'm facing is the dataset preparation. In the tutorial, this dataset is used https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge/data
But if I set up the folder structure according to the tutorial and place the dataset files in the folders, I get errors with the databunch.
databunch = BertDataBunch(args['data_dir'], LABEL_PATH, args.model_name,
                          train_file='train.csv',
                          val_file='val.csv',
                          test_data='test.csv',
                          text_col="comment_text",
                          label_col=label_cols,
                          batch_size_per_gpu=args['train_batch_size'],
                          max_seq_length=args['max_seq_length'],
                          multi_gpu=args.multi_gpu,
                          multi_label=True,
                          model_type=args.model_type)
It complains about the file format being wrong. How should I format the dataset, labels for this dataset with fastbert?
First of all, you can use the notebook from GitHub for FastBert.
https://github.com/kaushaltrivedi/fast-bert/blob/master/sample_notebooks/new-toxic-multilabel.ipynb
There is a small tutorial in the FastBert README on how to process the dataset before using it.
Create a DataBunch object
The databunch object takes training, validation and test csv files and converts the data into internal representation for BERT, RoBERTa, DistilBERT or XLNet. The object also instantiates the correct data-loaders based on device profile and batch_size and max_sequence_length.
from fast_bert.data_cls import BertDataBunch
databunch = BertDataBunch(DATA_PATH, LABEL_PATH,
                          tokenizer='bert-base-uncased',
                          train_file='train.csv',
                          val_file='val.csv',
                          label_file='labels.csv',
                          text_col='text',
                          label_col='label',
                          batch_size_per_gpu=16,
                          max_seq_length=512,
                          multi_gpu=True,
                          multi_label=False,
                          model_type='bert')
File format for train.csv and val.csv
index text label
0 Looking through the other comments, I'm amazed that there aren't any warnings to potential viewers of what they have to look forward to when renting this garbage. First off, I rented this thing with the understanding that it was a competently rendered Indiana Jones knock-off. neg
1 I've watched the first 17 episodes and this series is simply amazing! I haven't been this interested in an anime series since Neon Genesis Evangelion. This series is actually based off an h-game, which I'm not sure if it's been done before or not, I haven't played the game, but from what I've heard it follows it very well pos
2 his movie is nothing short of a dark, gritty masterpiece. I may be bias, as the Apartheid era is an area I've always felt for. pos
In case the column names are different from the usual text and label, you will have to provide those names in the databunch's text_col and label_col parameters.
labels.csv will contain a list of all unique labels. In this case the file will contain:
pos
neg
For multi-label classification, labels.csv will contain all possible labels:
toxic
severe_toxic
obscene
threat
insult
identity_hate
The file train.csv will then contain one column for each label, with each column value being either 0 or 1. Don't forget to change multi_label=True for multi-label classification in BertDataBunch.
id text toxic severe_toxic obscene threat insult identity_hate
0 Why the edits made under my username Hardcore Metallica Fan were reverted? 0 0 0 0 0 0
0 I will mess you up 1 0 0 1 0 0
label_col will be a list of label column names. In this case it will be:
['toxic','severe_toxic','obscene','threat','insult','identity_hate']
So, just keep the train.csv, val.csv (just make a copy of train.csv), and test.csv inside data/
In the labels folder, keep a labels.csv with the following contents.
toxic
severe_toxic
obscene
threat
insult
identity_hate
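Putting the multi-label case together, a sketch of the call with the layout above (the paths and column names are taken from this Jigsaw setup; adjust them to your folders):
from fast_bert.data_cls import BertDataBunch

label_cols = ['toxic', 'severe_toxic', 'obscene', 'threat', 'insult', 'identity_hate']

databunch = BertDataBunch('data/', 'labels/',
                          tokenizer='bert-base-uncased',
                          train_file='train.csv',
                          val_file='val.csv',
                          label_file='labels.csv',
                          text_col='comment_text',
                          label_col=label_cols,
                          batch_size_per_gpu=16,
                          max_seq_length=512,
                          multi_gpu=False,
                          multi_label=True,
                          model_type='bert')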

Basic importing coordinates into R and setting projection

OK, I am trying to read in a .csv file, get it into a spatial points data frame, and set the projection system to WGS 84. I then want to determine the distance between the points. This is what I have come up with:
cluster<-read.csv(file = "cluster.csv", stringsAsFactors=FALSE)
coordinates(cluster)<- ~Latitude+Longitude
cluster<-CRS("+proj=longlat +datum=WGS84")
d<-dist2Line(cluster)
This returns an error that says
Error in .pointsToMatrix(p) :
points should be vectors of length 2, matrices with 2 columns, or inheriting from a SpatialPoints* object
But this isn't working and I will be honest that I don't fully comprehend importing and manipulating spatial data in R. Any help would be great. Thanks
I was able to determine the issue I was running into. With WGS 84, the longitude comes before the latitude. This is just backwards from how all the GPS data I download is formatted (e.g. lat-long). Hope this helps anyone else who runs into this issue!
Thus the code should have been:
cluster<-read.csv(file = "cluster.csv", stringsAsFactors=FALSE)
coordinates(cluster)<- ~Longitude+Latitude
proj4string(cluster) <- CRS("+proj=longlat +datum=WGS84")   # assign the CRS rather than overwriting cluster
