OpenVINO Training Extensions: training the text detector on small images

I am trying to train the text detector at https://github.com/opencv/openvino_training_extensions/tree/develop/tensorflow_toolkit/text_detectio... and by default it is set up for an image size of 1280 * 768. I want to train it on cropped vehicle number plates, so I have resized my images to 200*120 px with padding, keeping the aspect ratio.
Is there any documentation that explains config.yaml?
It contains fields such as:
min_area: 300
score_map_shape: [128,128]
train_image_shape : [512,512]
Can someone please explain these?
I tried setting train_image_shape to 200,120 and got the error: operands could not be broadcast together with shapes (8,13,2)(8,14,2)
Thanks & Regards
Rawat

I suggest you refer to the config.py of the original implementation of PixelLink, available at the following link:
https://github.com/ZJULearning/pixel_link/blob/master/config.py
Additionally, I would also encourage you to read the paper, “PixelLink: Detecting Scene Text via Instance Segmentation”, available at the following link:
https://arxiv.org/abs/1801.01315
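One possible reading of the broadcast error, offered as a hypothesis and not something the toolkit documentation confirms: in the quoted config, score_map_shape is exactly train_image_shape divided by 4, and networks of this kind usually need an input size that is divisible by their total downsampling factor. The sketch below assumes five stride-2 stages that round up (a factor of 32, which is an assumption) and that the shapes in the error are ordered (height, width, 2). The helper names are hypothetical. Under those assumptions a width of 200 produces feature maps 13 and 14 columns wide, which matches the shapes in the error message.

```python
import math


def downsample_sizes(size, levels):
    """Sizes after repeatedly halving `size`, rounding up at each stage."""
    sizes = [size]
    for _ in range(levels):
        sizes.append(math.ceil(sizes[-1] / 2))
    return sizes


def round_up_to_multiple(size, multiple=32):
    """Smallest multiple of `multiple` that is >= size (32 is an assumed stride)."""
    return math.ceil(size / multiple) * multiple


width_sizes = downsample_sizes(200, 5)   # 200 -> 100 -> 50 -> 25 -> 13 -> 7
height_sizes = downsample_sizes(120, 5)  # 120 -> 60 -> 30 -> 15 -> 8 -> 4

# Upsampling the deepest map by 2 should reproduce the previous level.
upsampled_width = width_sizes[-1] * 2    # 14, but the previous level is 13
upsampled_height = height_sizes[-1] * 2  # 8, equal to the previous level

# A (height, width) that avoids the mismatch under the assumed stride of 32.
safe_shape = (round_up_to_multiple(120), round_up_to_multiple(200))
print(width_sizes, upsampled_width, safe_shape)
```

If this hypothesis holds, padding the plates to a size such as 128 x 224 and setting score_map_shape to a quarter of that would be worth trying.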

Related

Detecting and recognizing text on tyres (Python)

Hi, I have a bunch of images of tyres, and we need to detect and recognize the text on them.
I am having difficulty detecting the text because the text and the tyre background are the same colour. I have tried the EAST text detector and a YOLO text detector (without training on my own data).
Is there a better way to detect text against this kind of background?
I only need to detect the 10-character serial number, like "75R-0006884".
Edit: preprocessed image
Some preprocessing before passing the images through a non-retrained deep network might help. Try converting the image to grayscale or binarizing it, and experiment with the thresholds to see what gives maximum contrast.
That said, you might have to retrain an existing network to achieve good performance.
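As a rough illustration of the preprocessing idea above, here is a minimal NumPy sketch that converts an image to grayscale, stretches its contrast, and binarizes it at a few candidate thresholds. The function names, the percentile limits, and the threshold values are all assumptions chosen for the example, and the input is a synthetic low-contrast image, not real tyre data.

```python
import numpy as np


def to_grayscale(image_rgb):
    """Convert an H x W x 3 uint8 RGB image to a float grayscale image."""
    weights = np.array([0.299, 0.587, 0.114])  # standard luma weights
    return image_rgb.astype(np.float64) @ weights


def stretch_contrast(gray, low_pct=2, high_pct=98):
    """Linearly map the chosen percentile range onto 0..255."""
    lo, hi = np.percentile(gray, [low_pct, high_pct])
    if hi <= lo:  # flat image, nothing to stretch
        return np.zeros_like(gray)
    return np.clip((gray - lo) / (hi - lo), 0.0, 1.0) * 255.0


def binarize(gray, threshold):
    """Return a uint8 mask: 255 where gray > threshold, otherwise 0."""
    return np.where(gray > threshold, 255, 0).astype(np.uint8)


# Synthetic stand-in for dark text on a dark tyre: values differ by only ~15.
rng = np.random.default_rng(0)
image = rng.integers(40, 50, size=(60, 200, 3), dtype=np.uint8)
image[20:40, 50:150] += 15  # slightly brighter "text" region

stretched = stretch_contrast(to_grayscale(image))
# Try several thresholds and inspect which one separates the text best.
candidates = {t: binarize(stretched, t) for t in (96, 128, 160)}
```

On real images the best threshold has to be found by inspection, and a detector may still need retraining on the preprocessed data.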

Using OpenCV DNN Detection for images over 300x300

I am here because I am in need of some advice.
I am working with face detection. I have already tried some methods such as the DLIB detector and HoG, among others.
For now, I have started to use the OpenCV DNN detection based on the ResNet .caffemodel, but after a lot of attempts I have realized that this model is not very good for images larger than 300x300 (HxW).
Note that my images are 1520x2592 (HxW). When I apply the resize, almost all of the face information is lost: the faces in the original image are about 150x150 pixels, and after resizing for DNN detection they are only about 30x20.
Some approaches I already tried:
- Split figure in sub-figures
- Background subtraction
What I need to reach:
- Fast detection
- Reduce the number of lost faces (not detected)
Challenge:
- Big image with small faces in it
- A lot of area in the image not being used (but I can't change the location of the camera)
SSD-based networks are fully convolutional, which means you can vary the input size. Try passing inputs of different sizes and choose the one that gives satisfactory speed and accuracy. There is an example here: http://answers.opencv.org/question/202995/dnn-module-face-detection-poor-results-open-cv-343/
input = blobFromImage(img, 1.0, Size(1296, 760)); // x0.5
or
input = blobFromImage(img, 1.0, Size(648, 380)); // x0.25
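To compare candidate input sizes, a small helper can compute the (width, height) pair for a given scale together with the approximate face size after resizing. This is a sketch only: the helper name is hypothetical, and the 150-pixel face size and the 2592x1520 frame are taken from the question. The cv2.dnn.blobFromImage call is shown as a comment because it needs a loaded image.

```python
def scaled_blob_size(width, height, scale):
    """Return (width, height) after scaling, in the order OpenCV's Size uses."""
    return (round(width * scale), round(height * scale))


original_width, original_height = 2592, 1520
face_size = 150  # approximate face size in the original image, in pixels

for scale in (0.5, 0.25):
    size = scaled_blob_size(original_width, original_height, scale)
    print(scale, size, round(face_size * scale))
    # With OpenCV's Python bindings the blob could then be built as:
    # blob = cv2.dnn.blobFromImage(img, 1.0, size)

half = scaled_blob_size(original_width, original_height, 0.5)
quarter = scaled_blob_size(original_width, original_height, 0.25)
```

Larger inputs keep the faces bigger but cost more time per frame, so the scale is a trade-off between the two goals listed in the question.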

Cropping a minibatch of images in Pytorch -- each image differently

I have a tensor named input with dimensions 64x21x21. It is a minibatch of 64 images, each 21x21 pixels. I'd like to crop each image down to 11x11 pixels. So the output tensor I want would have dimensions 64x11x11.
I'd like to crop each image around a different "center pixel." The center pixels are given by a 2-dimensional long tensor named center with dimensions 64x2. For image i, center[i][0] gives the row index and center[i][1] gives the column index for the pixel that should be at the center in the output. We can assume that the center pixel is always at least 5 pixels away from the border.
Is there an efficient way to do this in pytorch (on the gpu)?
UPDATE: Let me clarify that the center tensor is formed by a deep neural network. It acts as a "hard attention mechanism," to use the reinforcement learning term for it. After I "crop" an image, that subimage becomes the input to another neural network. That's why I want to do the cropping in Pytorch: because the operations before and after the cropping are in Pytorch. I'd like to avoid having to transfer anything from the GPU back to the CPU.
I raised the question over on the pytorch forums, and got an answer there from smth. The grid_sample function should totally solve the problem.
https://discuss.pytorch.org/t/cropping-a-minibatch-of-images-each-image-a-bit-differently/12247
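Besides grid_sample, the same crop can be expressed with integer (advanced) indexing, which stays on whatever device the tensors live on. The sketch below is an alternative to the forum answer, not a reproduction of it, and the function name is hypothetical. It assumes, as the question does, that every center is at least 5 pixels from the border.

```python
import torch


def crop_around_centers(images, centers, size=11):
    """Crop a size x size window around a per-image center.

    images:  (N, H, W) tensor
    centers: (N, 2) long tensor of (row, col) on the same device as images
    """
    half = size // 2
    offsets = torch.arange(-half, half + 1, device=images.device)
    rows = centers[:, 0, None] + offsets  # (N, size) row indices per image
    cols = centers[:, 1, None] + offsets  # (N, size) column indices per image
    batch = torch.arange(images.shape[0], device=images.device)[:, None, None]
    # Index shapes (N,1,1), (N,size,1), (N,1,size) broadcast to (N,size,size).
    return images[batch, rows[:, :, None], cols[:, None, :]]


images = torch.arange(64 * 21 * 21, dtype=torch.float32).reshape(64, 21, 21)
centers = torch.randint(5, 16, (64, 2))  # rows and cols in 5..15
crops = crop_around_centers(images, centers)
print(crops.shape)
```

Like any hard crop, this passes gradients to the image values but not to the center coordinates.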
torchvision contains transforms including RandomCrop, but that doesn't seem to fit your use case if you want the images cropped in a specific way. I would reckon that PyTorch, a deep learning framework, is not the appropriate tool for cropping images.
Instead, have a look at this tutorial that uses pillow. You should be able to implement your use case with this. Also have a look at pillow-simd which does some operations faster.

Reducing / Enhancing known features in an image

I am a microbiology student new to computer vision, so any help will be greatly appreciated.
This question involves microscope images that I am trying to analyze. The goal I am trying to accomplish is to count bacteria in an image but I need to pre-process the image first to enhance any bacteria that are not fluorescing very brightly. I have thought about using several different techniques like enhancing the contrast or sharpening the image but it isn't exactly what I need.
I want to reduce the noise (black spaces) to 0 on the RGB scale and enhance the green spaces. I originally wrote a for loop in OpenCV with threshold limits to change each pixel, but I know there is a better way.
Here is an example that I made in Photoshop of the original image vs. what I want.
Original Image and enhanced Image.
I need to learn to do this in a Python environment so that I can automate the process. As I said, I am new, but I am familiar with Python's OpenCV, mahotas, numpy, etc., so I am not attached to a particular package. I am also very new to these techniques, so I would appreciate it even if you just point me in the right direction.
Thanks!
You can have a look at histogram equalization. This would emphasize the green and reduce the black range. There is an OpenCV tutorial here. Afterwards you can experiment with different thresholding methods to find the one that best isolates the bacteria.
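The per-pixel for loop mentioned in the question can usually be replaced by a vectorized NumPy mask. The following is a minimal sketch, assuming a B, G, R channel order as used by OpenCV; the function name, the dark threshold of 40, and the green gain of 1.5 are illustrative values, not recommendations.

```python
import numpy as np


def suppress_background_boost_green(image_bgr, dark_threshold=40, green_gain=1.5):
    """Set dim pixels to 0 and amplify the green channel of the remaining ones."""
    result = image_bgr.astype(np.float32)
    dark = result[:, :, 1] < dark_threshold  # mask of pixels with weak green
    result[dark] = 0                         # zero all three channels there
    result[:, :, 1] *= green_gain            # dark pixels stay at 0
    return np.clip(result, 0, 255).astype(np.uint8)


# Tiny synthetic image: dim background with a faint green blob in the middle.
image = np.zeros((4, 4, 3), dtype=np.uint8)
image[..., 1] = 20
image[1:3, 1:3, 1] = 100

enhanced = suppress_background_boost_green(image)
print(enhanced[..., 1])
```

The whole image is processed in a handful of array operations, so no explicit loop over pixels is needed.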
Use TensorFlow:
- Create your own dataset with images of bacteria and their positions stored in accompanying text files (the bigger the dataset, the better).
- Create a positive and a negative set of images.
- Update the default TensorFlow example with your images.
- Make sure you have a bunch of convolution layers.
- Train and test.
TensorFlow is well suited to such tasks, and you don't need to worry about different intensity levels.
I initially tried histogram equalization but did not get the desired results, so I used an adaptive threshold with the mean method:
th = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY_INV, 3, 2)
Then I applied the median filter:
median = cv2.medianBlur(th, 5)
Finally I applied morphological closing with the ellipse kernel:
k1 = cv2.getStructuringElement(cv2.MORPH_ELLIPSE,(5,5))
# iterations must be passed by keyword; the fourth positional argument is dst
closed = cv2.morphologyEx(median, cv2.MORPH_CLOSE, k1, iterations=3)
THIS PAGE will help you modify this result however you want.

Scikit-learn, image classification

This example allows the classification of images with scikit-learn:
http://scikit-learn.org/stable/auto_examples/classification/plot_digits_classification.html
However, it is important that all the images have the same size (width and height, as written in the comments).
How can I modify this code to allow classification of images with different sizes?
You will need to define your own feature extraction.
In the example above, every pixel represents a feature. If your images have different sizes, the most trivial (but certainly not the best) thing you can do is pad all images to the size of the largest image with, for example, white pixels.
Here is an example of how to add borders to an image.
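The padding idea can be sketched with NumPy as follows. This assumes 2-D grayscale uint8 images and pads on the bottom and right only; the function names are hypothetical, and the resulting feature matrix has one flattened image per row, which is the layout scikit-learn estimators expect.

```python
import numpy as np


def pad_to_shape(image, target_height, target_width, fill=255):
    """Pad a 2-D image on the bottom and right with `fill` (white by default)."""
    pad_h = target_height - image.shape[0]
    pad_w = target_width - image.shape[1]
    return np.pad(image, ((0, pad_h), (0, pad_w)),
                  mode="constant", constant_values=fill)


def images_to_features(images):
    """Pad all images to the largest height and width, then flatten each one."""
    max_h = max(img.shape[0] for img in images)
    max_w = max(img.shape[1] for img in images)
    padded = [pad_to_shape(img, max_h, max_w) for img in images]
    return np.stack([p.ravel() for p in padded])


images = [np.zeros((8, 8), dtype=np.uint8), np.zeros((5, 6), dtype=np.uint8)]
features = images_to_features(images)
print(features.shape)
```

The rows of `features` can then be passed to a classifier's fit method in place of the reshaped digit images from the linked example.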
