Generating bounding boxes if I have the text labels and co-ordinates of bounding boxes - conv-neural-network

I am trying to implement the Convolutional character network or CharNet model
https://github.com/MalongTech/research-charnet
I want to draw the bounding boxes on the images, but in the results I have only the coordinates and the character labels.
Like the example above, how can I generate the bounding boxes? Please help me out here.
Thanks in advance

Related

Word Bounding Boxes of Azure OCR results are shifted to the left?

I am using the Azure OCR form recognizer to perform OCR. When I draw the line bounding boxes, it works great, but when I use the word bounding boxes, they are slightly shifted to the left.
For example, the line bounding boxes (ignore the red box) would look like this: (image: lineocr)
But when I draw the word bounding boxes from the same OCR results, the result is shifted as follows: (image: wordocr)
Would anyone happen to know a solution for this problem, or maybe a nice workaround?
I have tried shifting the box by a certain percentage of the width of the bounding box but I would prefer to get the correct bounding box. The line bounding boxes have correct edges and I would expect the words to have them as well.

How to get the pixel values and coordinates of the objects inside bounding boxes - Yolov5?

I'm trying to get the pixel values and coordinates of the objects inside the bounding boxes of yolov5. Is there any method to do it? Thanks

Rectangular connected component extraction in python

There are multiple rectangular areas in the 2d-numpy array. All the rectangular areas have value 1, other areas are zero. I want to extract a minimum number of rectangular connected components from the numpy array. These connected components can touch each other in any direction.
I tried extracting connected components using the label function from scipy.ndimage.measurements, but it assigns the same label to rectangles which touch each other.
I also tried morphological opening, but I do not want to lose the original shape of the rectangles.
The image shows the expected output for a better understanding of the problem.
Is there a better way to extract a minimum number of perfectly rectangular regions?

Grouping bounding boxes and Separating them - PYTHON

I have this output image of an Arabic text. I want to group the small bounding boxes into the bigger ones, and then separate the overlapping boxes.
I have no clue how to start.
This is an example of what I want to do:

Finding a point clicked in a grid

Given this grid (http://i.stack.imgur.com/Nz39I.jpg; it is a trapezium/trapezoid, not a square), how do you find the point clicked by the user? I.e., when the user clicks a point in the grid, it should return coordinates like A1 or D5.
I am trying to write pseudo code for this and I am stuck. Can anyone help me? Thanks!
EDIT: I am still stuck... Does anyone know of any way to find the height of the grid?
If it is a true perspective projection, you can run the click point through the inverse projection to find its X,Z coordinates in the 3D world. That grid has regular spacing, so you can use simple math to get A1, D5, etc.
If it's just something you drew, then you'll have to compare the Y coordinates to the positions of the horizontal lines to figure out which row. Then you'll need to check its position (left/right) relative to the angled lines to get the column - for that, you'll need either coordinates of the end-points, or equations for the lines.
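The row/column check described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the sample coordinates, and the "letter for column, number for row" labelling are all made up here, and screen coordinates are assumed to have Y growing downward.

```python
from bisect import bisect_right

def is_right_of(point, top, bottom):
    """True if point lies on or to the right of the line top -> bottom.

    Assumes screen coordinates (Y grows downward) and top[1] < bottom[1].
    """
    cross = ((bottom[0] - top[0]) * (point[1] - top[1])
             - (bottom[1] - top[1]) * (point[0] - top[0]))
    return cross <= 0

def locate_cell(click, row_ys, column_lines):
    """Return a label such as "C1", or None if the click is outside the grid.

    row_ys: sorted Y positions of the horizontal lines, top to bottom.
    column_lines: angled lines from left to right, each given as
                  ((x_top, y_top), (x_bottom, y_bottom)).
    """
    # Row: find which pair of horizontal lines the Y coordinate falls between.
    row = bisect_right(row_ys, click[1]) - 1
    if not 0 <= row < len(row_ys) - 1:
        return None
    # Column: count the angled lines that lie to the left of the click.
    col = sum(is_right_of(click, top, bottom) for top, bottom in column_lines) - 1
    if not 0 <= col < len(column_lines) - 1:
        return None
    return chr(ord("A") + col) + str(row + 1)

# Hypothetical trapezoid: 200 px wide at the top, 400 px wide at the bottom.
ROW_YS = [0, 40, 100, 200]
COLUMN_LINES = [
    ((100, 0), (0, 200)),
    ((150, 0), (100, 200)),
    ((200, 0), (200, 200)),
    ((250, 0), (300, 200)),
    ((300, 0), (400, 200)),
]

print(locate_cell((210, 20), ROW_YS, COLUMN_LINES))  # C1
print(locate_cell((50, 150), ROW_YS, COLUMN_LINES))  # A3
```

The sign of the cross product tells which side of an angled line the click is on, so only the end-points of each line are needed, not its equation.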
Yet another option is to store an identical image where each "square" is flood-filled with a different color. You then check the color of the pixel where the user clicked but in this alternate image. This method assumes that it's a fixed image and is the least flexible.
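A minimal sketch of the colour-lookup idea, assuming the alternate image is already available as a 2D list of RGB tuples (in a real application the pixels would come from whatever image library you use). The helper names and the index-to-colour encoding are hypothetical; black is reserved for "outside the grid".

```python
def index_to_color(index):
    """Encode a cell index (starting at 1) as a unique RGB triple."""
    return ((index >> 16) & 0xFF, (index >> 8) & 0xFF, index & 0xFF)

def build_color_key(labels):
    """Map each fill colour to the label of the cell painted with it."""
    return {index_to_color(i): label for i, label in enumerate(labels, start=1)}

def cell_at(pick_map, color_key, x, y):
    """Look up the clicked pixel in the colour-coded image.

    Returns the cell label, or None for background pixels and
    clicks outside the image.
    """
    if not (0 <= y < len(pick_map) and 0 <= x < len(pick_map[0])):
        return None
    return color_key.get(pick_map[y][x])

# Tiny stand-in for the flood-filled image: two cells plus background.
BG = (0, 0, 0)
A1 = index_to_color(1)
B1 = index_to_color(2)
PICK_MAP = [
    [BG, A1, B1, BG],
    [BG, A1, B1, BG],
]
COLOR_KEY = build_color_key(["A1", "B1"])

print(cell_at(PICK_MAP, COLOR_KEY, 1, 0))  # A1
print(cell_at(PICK_MAP, COLOR_KEY, 3, 1))  # None
```

The colour-coded image must be saved without anti-aliasing or lossy compression, otherwise pixels near cell borders will not match any key.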
If you have the coordinates of the end points of the grid lines, try using the inside-outside test against each grid line to find the position.
Since this grid is just a 3D view of a 2D grid plane, there is a projective transform that maps coordinates on the screen grid to coordinates on the 2D plane. To find this transform, it is sufficient to mark 4 different points on the plane (say, the four corners), assign them coordinates on the 2D plane, and solve the resulting system of linear equations.
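One way this could look in practice, as a sketch only: the eight unknowns of the projective transform are found from four point pairs by plain Gaussian elimination. The function names, the sample corner coordinates, and the 4-column by 3-row grid are assumptions made for illustration.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate points (three may be collinear)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_homography(src, dst):
    """Fit h0..h7 so that each screen point in src maps to its dst point.

    u = (h0*x + h1*y + h2) / (h6*x + h7*y + 1)
    v = (h3*x + h4*y + h5) / (h6*x + h7*y + 1)
    """
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    return solve_linear(a, b)

def to_grid(h, x, y):
    """Map a screen point to (u, v) coordinates on the flat grid."""
    w = h[6] * x + h[7] * y + 1
    return ((h[0] * x + h[1] * y + h[2]) / w,
            (h[3] * x + h[4] * y + h[5]) / w)

def cell_label(h, x, y, n_cols, n_rows):
    """Return a label such as "D3", or None if the click misses the grid."""
    u, v = to_grid(h, x, y)
    if not (0 <= u < n_cols and 0 <= v < n_rows):
        return None
    return chr(ord("A") + int(u)) + str(int(v) + 1)

# Hypothetical screen corners of the trapezoid (clockwise from top-left)
# and the grid coordinates assigned to them (4 columns, 3 rows).
SCREEN = [(100, 0), (300, 0), (400, 200), (0, 200)]
GRID = [(0, 0), (4, 0), (4, 3), (0, 3)]
H = fit_homography(SCREEN, GRID)

print(cell_label(H, 210, 20, 4, 3))   # C1
print(cell_label(H, 390, 195, 4, 3))  # D3
```

Once the transform is known, every click costs only a handful of multiplications, and the same code works for any four-sided view of the grid as long as no three of the marked points are collinear.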
