This is a follow-up to my previous question here
I've been trying to convert the color data in a heatmap to RGB values.
source image
In the image below, the left side shows a subplot from panel D of the source image. It has 6 x 6 cells (6 rows and 6 columns). The right side shows the binarized image, in which the clicked cell is highlighted in white after running the code below. The code takes the image below as input. The output, mean = [ 27.72 26.83 144.17], is the mean BGR color of the cell highlighted in white in the right image.
A really nice solution that was provided as an answer to my previous question is the following (ref)
import cv2
import numpy as np

# print pixel value on click
def mouse_callback(event, x, y, flags, params):
    if event == cv2.EVENT_LBUTTONDOWN:
        # get specified color
        row = y
        column = x
        color = image[row, column]
        print('color = ', color)
        # calculate range
        thr = 20  # ± color range
        up_thr = color + thr
        up_thr[up_thr < color] = 255
        down_thr = color - thr
        down_thr[down_thr > color] = 0
        # find points in range
        img_thr = cv2.inRange(image, down_thr, up_thr)  # accepted range
        height, width, _ = image.shape
        left_bound = x - (x % round(width/6))
        right_bound = left_bound + round(width/6)
        up_bound = y - (y % round(height/6))
        down_bound = up_bound + round(height/6)
        img_rect = np.zeros((height, width), np.uint8)  # bounded by rectangle
        cv2.rectangle(img_rect, (left_bound, up_bound), (right_bound, down_bound), (255,255,255), -1)
        img_thr = cv2.bitwise_and(img_thr, img_rect)
        # get points around specified point
        img_spec = np.zeros((height, width), np.uint8)  # specified mask
        last_img_spec = np.copy(img_spec)
        img_spec[row, column] = 255
        kernel = np.ones((3,3), np.uint8)  # dilation structuring element
        while cv2.bitwise_xor(img_spec, last_img_spec).any():
            last_img_spec = np.copy(img_spec)
            img_spec = cv2.dilate(img_spec, kernel)
            img_spec = cv2.bitwise_and(img_spec, img_thr)
            cv2.imshow('mask', img_spec)
            cv2.waitKey(10)
        avg = cv2.mean(image, img_spec)[:3]
        mean.append(np.around(np.array(avg), 2))
        print('mean = ', np.around(np.array(avg), 2))
        # print(mean) # appends data to variable mean

if __name__ == '__main__':
    mean = []  #np.zeros((6, 6))
    # create window and callback
    winname = 'img'
    cv2.namedWindow(winname)
    cv2.setMouseCallback(winname, mouse_callback)
    # read & display image
    image = cv2.imread('ip2.png', 1)
    #image = image[3:62, 2:118] # crop the image to 6x6 cells
    #---- resize image--------------------------------------------------
    # appended this to the original code
    print('Original Dimensions : ', image.shape)
    scale_percent = 220  # percent of original size
    width = int(image.shape[1] * scale_percent / 100)
    height = int(image.shape[0] * scale_percent / 100)
    dim = (width, height)
    # resize image
    image = cv2.resize(image, dim, interpolation=cv2.INTER_AREA)
    # ----------------------------------------------------------------------
    cv2.imshow(winname, image)
    cv2.waitKey()  # press any key to exit
    cv2.destroyAllWindows()
What do I want to do next?
The mean of the RGB values thus obtained has to be mapped to the values in the following legend provided in the source image,
I would like to ask for suggestions on how to map the RGB data to the values in the legend.
Note: In my previous post it was suggested that one could fit the RGB values to an equation that gives continuous results.
Any suggestions in this direction will also be helpful.
EDIT:
Answering the comment below
I did the following to measure the RGB values of legend
Input image:
This image is 8 cells wide (columns) and 1 cell high (rows).
Changed these lines of code:
left_bound = x - (x % round(width/8)) # 6 replaced with 8
right_bound = left_bound + round(width/8) # 6 replaced with 8
up_bound = y - (y % round(height/1)) # 6 replaced with 1
down_bound = up_bound + round(height/1) # 6 replaced with 1
Mean obtained for each cell/ each color in legend from left to right:
mean = [ 82.15 174.95 33.66]
mean = [45.55 87.01 17.51]
mean = [8.88 8.61 5.97]
mean = [16.79 17.96 74.46]
mean = [ 35.59 30.53 167.14]
mean = [ 37.9 32.39 233.74]
mean = [120.29 118. 240.34]
mean = [238.33 239.56 248.04]
You can try a piecewise approach: build pairwise transitions between neighbouring legend colors:
c[i->i+1](t)=t*(R[i+1],G[i+1],B[i+1])+(1-t)*(R[i],G[i],B[i])
Do the same for these values:
val[i->i+1](t)=t*val[i+1]+(1-t)*val[i]
Here i is the index of a color in the legend scale and t is a parameter in the range [0, 1].
This gives a continuous mapping between colors and values. You only need to find the parameters i and t whose color is closest to the sample, then read the value from the mapping.
Update:
To find the color parameters, you can think of every pair of neighbouring legend colors as a pair of 3D points (the ends of a line segment), and of your queried color as an external 3D point. Now you just need to find the length of the perpendicular from the external point to each line, then, iterating over the legend color pairs, pick the shortest perpendicular (now you have i).
Then find intersection point of the perpendicular and the line. This point will be located at the distance A from line start and if line length is L then parameter value t=A/L.
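The perpendicular-projection idea above can be sketched in plain Python as follows. This is only an illustration under some assumptions: the legend colors are the BGR means measured in the question, the values are the ones used in the brute-force code of this answer, and `color_to_value` is a hypothetical helper name. The parameter t is clamped to [0, 1] so that the foot of the perpendicular always stays on the segment.

```python
# Legend means (BGR order) from the question and the values used in this answer.
LEGEND = [(82.15, 174.95, 33.66), (45.55, 87.01, 17.51), (8.88, 8.61, 5.97),
          (16.79, 17.96, 74.46), (35.59, 30.53, 167.14), (37.9, 32.39, 233.74),
          (120.29, 118.0, 240.34), (238.33, 239.56, 248.04)]
VALUES = [-4, -2, 0, 2, 4, 8, 16, 32]


def color_to_value(query, legend=LEGEND, values=VALUES):
    """Project `query` onto every legend segment and interpolate the value
    at the closest point (hypothetical helper, not part of OpenCV)."""
    best_dist2, best_value = float("inf"), None
    for i in range(len(legend) - 1):
        p0, p1 = legend[i], legend[i + 1]
        seg = [b - a for a, b in zip(p0, p1)]      # segment direction
        rel = [q - a for a, q in zip(p0, query)]   # query relative to segment start
        seg_len2 = sum(s * s for s in seg)
        # t = A / L, written as dot(rel, seg) / |seg|^2, then clamped to [0, 1]
        t = sum(r * s for r, s in zip(rel, seg)) / seg_len2 if seg_len2 else 0.0
        t = max(0.0, min(1.0, t))
        foot = [a + t * s for a, s in zip(p0, seg)]
        dist2 = sum((q - f) ** 2 for q, f in zip(query, foot))
        if dist2 < best_dist2:
            best_dist2 = dist2
            best_value = (1 - t) * values[i] + t * values[i + 1]
    return best_value


print(color_to_value((5, 0, 200)))  # value mapped to the queried BGR color
```

Compared with sampling each segment at a fixed number of steps, the projection gives t directly, so the accuracy does not depend on a subdivision count.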
Update2:
A simple brute-force solution to illustrate the piecewise approach:
#include "opencv2/opencv.hpp"
#include <cfloat>   // FLT_MAX
#include <string>
#include <vector>
#include <iostream>
using namespace std;
using namespace cv;
int main(int argc, char* argv[])
{
Mat Image=cv::Mat::zeros(100,250,CV_32FC3);
std::vector<cv::Scalar> Legend;
Legend.push_back(cv::Scalar(82.15,174.95,33.66));
Legend.push_back(cv::Scalar(45.55, 87.01, 17.51));
Legend.push_back(cv::Scalar(8.88, 8.61, 5.97));
Legend.push_back(cv::Scalar(16.79, 17.96, 74.46));
Legend.push_back(cv::Scalar(35.59, 30.53, 167.14));
Legend.push_back(cv::Scalar(37.9, 32.39, 233.74));
Legend.push_back(cv::Scalar(120.29, 118., 240.34));
Legend.push_back(cv::Scalar(238.33, 239.56, 248.04));
std::vector<float> Values;
Values.push_back(-4);
Values.push_back(-2);
Values.push_back(0);
Values.push_back(2);
Values.push_back(4);
Values.push_back(8);
Values.push_back(16);
Values.push_back(32);
int w = 30;
int h = 10;
for (int i = 0; i < Legend.size(); ++i)
{
cv::rectangle(Image, Rect(i * w, 0, w, h), Legend[i]/255, -1);
}
std::vector<cv::Scalar> Smooth_Legend;
std::vector<float> Smooth_Values;
for (int i = 0; i < Legend.size()-1; ++i)
{
cv::Scalar c1 = Legend[i];
cv::Scalar c2 = Legend[i + 1];
float v1 = Values[i];
float v2 = Values[i+1];
for (int j = 0; j < w; ++j)
{
float t = (float)j / (float)w;
Scalar c = c2 * t + c1 * (1 - t);
float v = v2 * t + v1 * (1 - t);
float x = i * w + j;
line(Image, Point(x, h), Point(x, h + h), c/255, 1);
Smooth_Values.push_back(v);
Smooth_Legend.push_back(c);
}
}
Scalar qp = cv::Scalar(5, 0, 200);
float d_min = FLT_MAX;
int ind = -1;
for (int i = 0; i < Smooth_Legend.size(); ++i)
{
float d = cv::norm(qp- Smooth_Legend[i]);
if (d < d_min)
{
ind = i;
d_min = d;
}
}
std::cout << Smooth_Values[ind] << std::endl;
line(Image, Point(ind, 3 * h), Point(ind, 4 * h), Scalar::all(255), 2);
circle(Image, Point(ind, 4 * h), 3, qp/255,-1);
putText(Image, std::to_string(Smooth_Values[ind]), Point(ind, 70), FONT_HERSHEY_DUPLEX, 1, Scalar(0, 0.5, 0.5), 0.002);
cv::imshow("Legend", Image);
cv::imwrite("result.png", Image*255);
cv::waitKey();
}
The result:
Python:
import cv2
import numpy as np

height = 100
width = 250
# np.float was removed from recent NumPy releases, so np.float64 is used instead
Image = np.zeros((height, width, 3), np.float64)
legend = np.array([(82.15, 174.95, 33.66),
                   (45.55, 87.01, 17.51),
                   (8.88, 8.61, 5.97),
                   (16.79, 17.96, 74.46),
                   (35.59, 30.53, 167.14),
                   (37.9, 32.39, 233.74),
                   (120.29, 118., 240.34),
                   (238.33, 239.56, 248.04)], np.float64)
values = np.array([-4, -2, 0, 2, 4, 8, 16, 32], np.float64)
# width of a cell; also defines the number of subdivisions
# of one segment transition.
# Larger values give more accuracy, but run slower.
w = 30
# Only for display purposes. Height of the bars in the result image.
h = 10
# Plot legend cells (to check correctness only)
for i in range(len(legend)):
    col = legend[i]
    cv2.rectangle(Image, (i * w, 0, w, h), col/255, -1)
# Start forming the smoothed scales for colors and the corresponding values
Smooth_Legend = []
Smooth_Values = []
for i in range(len(legend)-1):  # iterate known knots
    c1 = legend[i]       # start color point
    c2 = legend[i + 1]   # end color point
    v1 = values[i]       # start value
    v2 = values[i+1]     # end value
    for j in range(w):   # slide inside [start:end] interval.
        t = float(j) / float(w)     # map it to [0:1] interval
        c = c2 * t + c1 * (1 - t)   # transition between c1 and c2
        v = v2 * t + v1 * (1 - t)   # transition between v1 and v2
        x = i * w + j               # global scale coordinate (for drawing)
        cv2.line(Image, (x, h), (x, h + h), c/255, 1)  # draw one tick of smoothed scale
        Smooth_Values.append(v)  # append smoothed values for next step
        Smooth_Legend.append(c)  # append smoothed color for next step
# queried color
qp = np.array([5, 0, 200])
# initial value for minimal distance set to large value
d_min = 1e7
# index for color search
ind = -1
# search for minimal distance from queried color to smoothed scale color
for i in range(len(Smooth_Legend)):
    # distance
    d = cv2.norm(qp-Smooth_Legend[i])
    if (d < d_min):
        ind = i
        d_min = d
# ind contains the index of the closest color in the smoothed scale,
# and now we can extract the corresponding value from the smoothed values scale
print(Smooth_Values[ind])  # value mapped to queried color.
# plot pointer (to check ourselves)
cv2.line(Image, (ind, 3 * h), (ind, 4 * h), (255,255,255), 2)
cv2.circle(Image, (ind, 4 * h), 3, qp/255, -1)
cv2.putText(Image, str(Smooth_Values[ind]), (ind, 70), cv2.FONT_HERSHEY_DUPLEX, 1, (0, 0.5, 0.5), 1)
# show window
cv2.imshow("Legend", Image)
# save to file (converted to 8-bit so the PNG encoder accepts it)
cv2.imwrite("result.png", (Image*255).astype(np.uint8))
cv2.waitKey()
Related
I'm struggling to create a light YoloV1 (with only one bounding box) on MNIST dataset (I randomly paste 28x28 digit into a 75x75 black background).
I can't figure out how to turn relative-to-cell coordinates into absolute coordinates.
So far, I have been using the ground-truth bounding boxes to find the cell that should contain an object. I save its i,j position and then use that position to convert my predictions back to absolute coordinates.
This method works, but when it is time to run detection on a real image, I won't have the ground-truth coordinates, so I won't have the i,j position of the object or the absolute position of the predicted bounding box.
Here are some lines of code:
Encoding absolute coordinates of shape (N,4) to (N,S,S,5)
def encode(self, box):
    """
    box : torch.Tensor of shape (N,4)
        Absolute coordinates [xmin, ymin, w_bbox, h_bbox]
    """
    ### Absolute box infos
    xmin, ymin, w_bbox, h_bbox = box
    ### Relative box infos
    rw = w_bbox / 75
    rh = h_bbox / 75
    rx_min = xmin / 75
    ry_min = ymin / 75
    ### x and y box center coords
    rxc = (rx_min + rw/2)
    ryc = (ry_min + rh/2)
    ### Object grid location
    i = (rxc / self.cell_size).ceil() - 1.0
    j = (ryc / self.cell_size).ceil() - 1.0
    i, j = int(i), int(j)
    ### x & y of the cell left-top corner
    x0 = i * self.cell_size
    y0 = j * self.cell_size
    ### x & y of the box on the cell, normalized from 0.0 to 1.0.
    x_norm = (rxc - x0) / self.cell_size
    y_norm = (ryc - y0) / self.cell_size
    box_target = torch.zeros(self.S, self.S, 4+1)
    box_target[j, i, :5] = torch.Tensor([x_norm, y_norm, rw, rh, 1.])
    return box_target
Convert relative-to-cell coordinates into absolute ones
def relative2absolute(box_true:torch.Tensor, box_pred:torch.Tensor)->tuple:
    """
    Turns bounding box relative to cell coordinates into absolute coordinates
    (pixels). Used to calculate IoU and to plot boxes.
    Args:
        box_true : torch.Tensor of shape (N, S, S, 5)
            Groundtruth bounding box coordinates to convert.
        box_pred : torch.Tensor of shape (N, S, S, 5)
            Predicted bounding box coordinates to convert.
    Return:
        box_true_absolute : torch.Tensor of shape (N, 4)
        box_pred_absolute : torch.Tensor of shape (N, 4)
    """
    assert len(box_true.shape)==4 and len(box_pred.shape)==4, "Bbox should be of size (N,S,S,5)."
    SIZEHW = 75
    S = 6
    CELL_SIZE = 1/S
    ### Get non-zero coordinates
    cells_with_obj = box_true.nonzero()[::5]
    N, cells_i, cells_j, _ = cells_with_obj.permute(1,0)
    ### Retrieving box coordinates. TBM if nb_obj > 1
    xrcell_true, yrcell_true, rw_true, rh_true = box_true[N, cells_i, cells_j, 0:4].permute(1,0)
    xrcell_pred, yrcell_pred, rw_pred, rh_pred = box_pred[N, cells_i, cells_j, 0:4].permute(1,0)
    ### Compute relative-to-image center coordinates
    xc_rimg_true = xrcell_true * CELL_SIZE + cells_j * CELL_SIZE
    xc_rimg_pred = xrcell_pred * CELL_SIZE + cells_j * CELL_SIZE
    yc_rimg_true = yrcell_true * CELL_SIZE + cells_i * CELL_SIZE
    yc_rimg_pred = yrcell_pred * CELL_SIZE + cells_i * CELL_SIZE
    ### Compute absolute top left coordinates
    xmin_true = (xc_rimg_true - rw_true/2) * SIZEHW
    xmin_pred = (xc_rimg_pred - rw_pred/2) * SIZEHW
    ymin_true = (yc_rimg_true - rh_true/2) * SIZEHW
    ymin_pred = (yc_rimg_pred - rh_pred/2) * SIZEHW
    ### Compute absolute bottom right coordinates
    xmax_true = xmin_true + rw_true*SIZEHW
    xmax_pred = xmin_pred + rw_pred*SIZEHW
    ymax_true = ymin_true + rh_true*SIZEHW
    ymax_pred = ymin_pred + rh_pred*SIZEHW
    ### Stacking
    box_true_absolute = torch.stack((xmin_true, ymin_true, xmax_true, ymax_true), dim=-1)
    box_pred_absolute = torch.stack((xmin_pred, ymin_pred, xmax_pred, ymax_pred), dim=-1)
    return box_true_absolute, box_pred_absolute
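One common way to drop the dependency on the ground truth at inference time is to take the cell with the highest predicted confidence (index 4 of the last dimension) and decode the box from that cell's own row and column. The sketch below shows the idea for a single image using plain nested lists rather than tensors. The name `decode_best_cell` is hypothetical, and the layout [x_norm, y_norm, rw, rh, conf] with rows indexed first is assumed to match the `encode` function above.

```python
def decode_best_cell(pred, S=6, size=75):
    """Decode one S x S x 5 prediction into (xmin, ymin, xmax, ymax) pixels.

    pred[j][i] = [x_norm, y_norm, rw, rh, conf], where j is the row (y)
    and i is the column (x), as in box_target[j, i, :5] of encode().
    """
    cell = 1.0 / S
    # pick the (row, column) whose confidence is highest: no ground truth needed
    j, i = max(((r, c) for r in range(S) for c in range(S)),
               key=lambda rc: pred[rc[0]][rc[1]][4])
    x_norm, y_norm, rw, rh, _ = pred[j][i]
    # centre relative to the image, in [0, 1]
    xc = (i + x_norm) * cell
    yc = (j + y_norm) * cell
    # absolute corners in pixels
    xmin = (xc - rw / 2) * size
    ymin = (yc - rh / 2) * size
    return xmin, ymin, xmin + rw * size, ymin + rh * size


# toy prediction: every cell empty except row 2, column 3
pred = [[[0.0] * 5 for _ in range(6)] for _ in range(6)]
pred[2][3] = [0.5, 0.5, 0.4, 0.4, 0.9]
print(decode_best_cell(pred))
```

The same selection can be applied per image in a batch; with several objects per image, a confidence threshold would replace the single maximum.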
I am trying to create a colormap that should linearly vary according to a "w" value, from white-red to white-purple.
So...
For w = 1, the minimum value's color (0 for example) would be white and the maximum value's color (+ inf) would be red.
For w = 10 (example), the minimum value's color (0 for example) would be white and the maximum value's color (+ inf) would be orange.
For w = 30 (example), the minimum value's color (0 for example) would be white and the maximum value's color (+ inf) would be yellow.
and so on, until...
For w = 100 (example), the minimum value's color (0 for example) would be white and the maximum value's color (+ inf) would be purple.
I used this website to generate the image : https://g.co/kgs/utJPmw
I can get the first (w = 1) color map by using this code, but no idea on how to make it vary according to what I would like to :
import numpy as np
import matplotlib.cm as cm
from matplotlib.colors import ListedColormap, LinearSegmentedColormap
color_map_1 = cm.get_cmap('Reds', 256)
newcolors_1 = color_map_1(np.linspace(0, 1, 256))
color_map_1 = ListedColormap(newcolors_1)
Any idea to do such a thing in python would be so much welcome,
Thank you guys
I finally found the solution. Maybe this is not the cleanest way, but it works very well for what I want to do. The colormaps I create can vary from white-red to white-purple (color spectrum). 765 variations are possible here, but by adding some small changes to the code, it could vary much more or less, depending on what you want.
In the following code : using the create_custom_colormap function, you get as an output cmap and color_map. cmap is the matrix containing the (r,g,b) values. color_map is the object that can be used in matplotlib (imshow) as an actual colormap, on any image.
Using the following code, define the function we will need for this job:
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import ListedColormap, LinearSegmentedColormap

def create_image():
    '''
    Create some random image on which we will apply the colormap. Any other image could replace this one, with or without extent.
    '''
    dx, dy = 0.015, 0.05
    x = np.arange(-4.0, 4.0, dx)
    y = np.arange(-4.0, 4.0, dy)
    X, Y = np.meshgrid(x, y)
    extent = np.min(x), np.max(x), np.min(y), np.max(y)
    def z_fun(x, y):
        return (1 - x / 2 + x**5 + y**6) * np.exp(-(x**2 + y**2))
    Z2 = z_fun(X, Y)
    return(extent, Z2)

def create_cmap(**kwargs):
    '''
    Create a color matrix and a color map using 3 lists of r (red), g (green) and b (blue) values.
    Parameters:
    - r (list of floats): red value, between 0 and 1
    - g (list of floats): green value, between 0 and 1
    - b (list of floats): blue value, between 0 and 1
    Returns:
    - color_matrix (numpy 2D array): contains all the rgb values for a given colormap
    - color_map (matplotlib object): the color_matrix transformed into an object that matplotlib can use on figures
    '''
    color_matrix = np.empty([256,3])
    color_matrix.fill(0)
    color_matrix[:,0] = kwargs["r"]
    color_matrix[:,1] = kwargs["g"]
    color_matrix[:,2] = kwargs["b"]
    color_map = ListedColormap(color_matrix)
    return(color_matrix, color_map)

def standardize_timeseries_between(timeseries, borne_inf = 0, borne_sup = 1):
    '''
    For readability reasons, I defined r,g,b values between 0 and 255. But the matplotlib ListedColormap function expects values between 0 and 1.
    Parameters:
    timeseries (list of floats): can be one color vector in our case (either r, g or b)
    borne_inf (int): The minimum value in our timeseries will be replaced by this value
    borne_sup (int): The maximum value in our timeseries will be replaced by this value
    '''
    timeseries_standardized = []
    for i in range(len(timeseries)):
        a = (borne_sup - borne_inf) / (max(timeseries) - min(timeseries))
        b = borne_inf - a * min(timeseries)
        timeseries_standardized.append(a * timeseries[i] + b)
    timeseries_standardized = np.array(timeseries_standardized)
    return(timeseries_standardized)

def create_custom_colormap(weight):
    '''
    This function is at the heart of the process. It takes only one < weight > parameter, that you can choose.
    - For weight between 0 and 255, the colormaps that are created will vary between white-red (min-max) to white-yellow (min-max).
    - For weight between 256 and 510, the colormaps that are created will vary between white-green (min-max) to white-cyan (min-max).
    - For weight between 511 and 765, the colormaps that are created will vary between white-blue (min-max) to white-purple (min-max).
    '''
    if weight <= 255:
        ### 0 <= w <= 255
        r = np.repeat(1, 256)
        g = np.arange(0, 256, 1)
        g = standardize_timeseries_between(g, weight/256, 1)
        g = g[::-1]
        b = np.arange(0, 256, 1)
        b = standardize_timeseries_between(b, 1/256, 1)
        b = b[::-1]
    if weight > 255 and weight <= 255*2:
        weight = weight - 255
        ### 255 < w <= 510
        g = np.repeat(1, 256)
        r = np.arange(0, 256, 1)
        r = standardize_timeseries_between(r, 1/256, 1)
        r = r[::-1]
        b = np.arange(0, 256, 1)
        b = standardize_timeseries_between(b, weight/256, 1)
        b = b[::-1]
    if weight > 255*2 and weight <= 255*3:
        weight = weight - 255*2
        ### 510 < w <= 765
        b = np.repeat(1, 256)
        r = np.arange(0, 256, 1)
        r = standardize_timeseries_between(r, weight/256, 1)
        r = r[::-1]
        g = np.arange(0, 256, 1)
        g = standardize_timeseries_between(g, 1/256, 1)
        g = g[::-1]
    cmap, color_map = create_cmap(r=r, g=g, b=b)
    return(cmap, color_map)
Use the function create_custom_colormap to get the colormap you want, by giving as argument to the function a value between 0 and 765 (see 5 examples in the figure below):
### Let us create some image (any other could be used).
extent, Z2 = create_image()
### Now create a color map, using the w value you want 0 = white-red, 765 = white-purple.
cmap, color_map = create_custom_colormap(weight=750)
### Plot the result
plt.imshow(Z2, cmap =color_map, alpha=0.7,
interpolation ='bilinear', extent=extent)
plt.colorbar()
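For comparison, here is a shorter sketch of the same idea that uses only the standard-library `colorsys` module. It is an alternative under stated assumptions, not the method above: w is mapped linearly to a hue between red (0.0) and purple (assumed here to be hue 0.8), and the ramp goes from white to that hue by increasing the saturation. The name `white_to_hue_ramp` and its default parameters are hypothetical, and a linear hue mapping will not reproduce exactly the w-to-color pairs listed in the question.

```python
import colorsys


def white_to_hue_ramp(w, w_min=1.0, w_max=100.0, n=256, max_hue=0.8):
    """Return n (r, g, b) tuples in [0, 1], from white to a hue chosen by w."""
    frac = (w - w_min) / (w_max - w_min)
    frac = max(0.0, min(1.0, frac))   # keep w inside [w_min, w_max]
    hue = frac * max_hue              # 0.0 = red, max_hue = purple (assumed)
    # saturation 0 gives white, saturation 1 gives the pure hue
    return [colorsys.hsv_to_rgb(hue, k / (n - 1), 1.0) for k in range(n)]


ramp = white_to_hue_ramp(1)
print(ramp[0], ramp[-1])  # first and last colors of the w = 1 ramp
```

The returned list can be passed to `ListedColormap` in the same way as the `color_matrix` built above.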
As you can guess from the title, I am trying to solve the following problem.
Given a grid of size NxN and a circular object O of radius R with centre C at (x_c, y_c), find which Blocks are occupied by O.
An example is shown in the figure below:
In that example, I expect the output to be [1,2,5,6].
I would be very grateful if anyone has a suggestion or resources.
Find the range of rows affected:
miny = floor(y_c-r);
maxy = ceil(y_c+r)-1;
For each row, find the range of columns by intersecting the circle with the horizontal line through it that has the largest intersection. There are 3 cases:
for (y=miny; y<=maxy; ++y) {
if (y+1 < y_c)
ytest = y+1;
else if (y > y_c)
ytest = y;
else
ytest = y_c;
// solve (x-x_c)^2 + (ytest-y_c)^2 = r^2
ydist2 = (ytest-y_c)*(ytest-y_c);
xdiff = sqrt(r*r - ydist2);
minx = floor(x_c - xdiff);
maxx = ceil(x_c + xdiff)-1;
for (x=minx; x<=maxx; ++x)
output(x,y);
}
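The row-by-row procedure above can be sketched in Python as follows. The sketch assumes unit-sized cells, with cell (x, y) covering [x, x+1] x [y, y+1], and does not clamp the result to the N x N grid; the function name `cells_touched_by_circle` is hypothetical.

```python
import math


def cells_touched_by_circle(x_c, y_c, r):
    """Return the (x, y) indices of unit cells that a circle overlaps."""
    cells = []
    min_y = math.floor(y_c - r)
    max_y = math.ceil(y_c + r) - 1
    for y in range(min_y, max_y + 1):
        # choose the horizontal line inside this row that is closest to the centre
        if y + 1 < y_c:
            y_test = y + 1      # row lies entirely below the centre line
        elif y > y_c:
            y_test = y          # row lies entirely above the centre line
        else:
            y_test = y_c        # the centre is inside this row
        # solve (x - x_c)^2 + (y_test - y_c)^2 = r^2 for the half-width
        x_diff = math.sqrt(max(0.0, r * r - (y_test - y_c) ** 2))
        min_x = math.floor(x_c - x_diff)
        max_x = math.ceil(x_c + x_diff) - 1
        cells.extend((x, y) for x in range(min_x, max_x + 1))
    return cells


print(cells_touched_by_circle(1.0, 1.0, 0.5))
```

For a grid whose cells are not 1 x 1, dividing x_c, y_c and r by the cell size first gives the same result in cell units.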
I used Python3 and OpenCv but it can be done in any language.
Source:
import cv2
import numpy as np
import math

def drawgrid(im,xv,yv,cx,cy,r):
    #params: image,grid_width,grid_height,circle_x,circle_y,circle_radius
    cellcoords = set() #a set keeps only unique values
    h,w,d = im.shape
    #cell width,height
    cew = int(w/xv)
    ceh = int(h/yv)
    #center of circle falls in this cell's coords
    nx = int(cx / cew )
    ny = int(cy / ceh )
    cellcoords.add((nx,ny))
    for deg in range(0,360,1):
        #math.cos and math.sin expect radians, so convert the angle first
        rad = math.radians(deg)
        cirx = cx+math.cos(rad)*r
        ciry = cy+math.sin(rad)*r
        #find cell coords of the circumference point
        nx = int(cirx / cew )
        ny = int(ciry / ceh )
        cellcoords.add((nx,ny))
    #grid,circle colors
    red = (0,0,255)
    green = (0,255,0)
    #drawing red lines
    for ix in range(xv):
        lp1 = (cew * ix , 0)
        lp2 = (cew * ix , h)
        cv2.line(im,lp1,lp2,red,1)
    for iy in range(yv):
        lp1 = (0 , ceh * iy)
        lp2 = (w , ceh * iy)
        cv2.line(im,lp1,lp2,red,1)
    #drawing green circle
    cpoint = (int(cx),int(cy))
    cv2.circle(im,cpoint,r,green)
    print("cells coords:",cellcoords)

imw=500
imh=500
#np.zeros gives a black image; np.ndarray would leave the pixels uninitialized
im = np.zeros((imh,imw,3),dtype="uint8")
drawgrid(im,9,5, 187,156 ,50)
cv2.imshow("grid",im)
cv2.waitKey(0)
output: cells coords: {(3, 2), (3, 1), (2, 1), (2, 2), (4, 1)}
cells coords are zero based x,y.
So ...
1° cell top left is at (0,0)
2° cell is at (1,0)
3° cell is at (2,0)
1° cell of 2° row is at (0,1)
2° cell of 2° row is at (1,1)
3° cell of 2° row is at (2,1)
and so on ...
Getting cell number from cell coordinates might be fun for you
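A minimal sketch of that last step, assuming the cells are numbered from 1, left to right and then top to bottom (the numbering in the question's figure is not available here, so this is an assumption). The helper name `cell_number` is hypothetical.

```python
def cell_number(x, y, columns):
    """Convert zero-based (x, y) cell coords to a 1-based row-major number."""
    return y * columns + x + 1


# cell coords printed by the code above, on a grid with 9 columns
coords = {(3, 2), (3, 1), (2, 1), (2, 2), (4, 1)}
print(sorted(cell_number(x, y, 9) for x, y in coords))  # [12, 13, 14, 21, 22]
```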
I am using the connected components (NN) method to detect and correct skew in a document image. I have an image of a skewed document and have done the following steps:
1. Document image preprocessing.
2. Selection of eligible connected components.
import cv2
import numpy as np
import matplotlib.pyplot as plt

def imshow(image1):
    plt.figure(figsize=(20,10))
    plt.imshow(image1)

# invr_binary, connectivity (4 or 8) and image come from the preprocessing step
output = cv2.connectedComponentsWithStats(invr_binary, connectivity, cv2.CV_32S)
(numLabels, labels, stats, centroids) = output
## non text removal
w_avg=stats[1:, cv2.CC_STAT_WIDTH].mean()
h_avg=stats[1: , cv2.CC_STAT_HEIGHT].mean()
B_max=(w_avg * h_avg) * 4
B_min=(w_avg * h_avg) * 0.25
result = np.zeros((labels.shape), np.uint8)
output1=image.copy()
a, b=0.6, 2
# label 0 is the background, so start at 1 and use the same index
# for stats, centroids and labels
for i in range(1, numLabels):
    area=stats[i, cv2.CC_STAT_AREA]
    if area>B_min and area<B_max: ## non text removal
        result[labels == i] = 255
        x = stats[i, cv2.CC_STAT_LEFT]
        y = stats[i, cv2.CC_STAT_TOP]
        w = stats[i, cv2.CC_STAT_WIDTH]
        h = stats[i, cv2.CC_STAT_HEIGHT]
        (cX, cY) = centroids[i]
        c=w/h
        if a<c and c<b: ## A and C type filtering
            result[labels == i] = 255
            cv2.rectangle(output1, (x, y), (x + w, y + h), (0, 255, 0), 1)
            cv2.circle(output1, (int(cX), int(cY)), 1, (0, 0, 255), -1)
imshow(output1)
input image :
output image :
After finding the center points of the text components, shown in the output image, the next step is the skew slope calculation, but I do not understand how to compute it. I am following Section 3.3 (page 7) of this research paper:
https://www.mdpi.com/2079-9292/9/1/55/pdf
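As a starting point, here is a minimal sketch of one simple nearest-neighbour way to turn centroids into a skew angle. It is not claimed to be the procedure of Section 3.3 of the paper: for every centroid it finds the nearest other centroid, measures the angle of the line joining them, and takes the median of those angles. The function name `estimate_skew_degrees` is hypothetical, the input is assumed to be a list of at least two (x, y) centroids such as the filtered rows of `centroids`, and the search is a plain O(n^2) loop.

```python
import math
from statistics import median


def estimate_skew_degrees(centroids):
    """Median angle (degrees) between each centroid and its nearest neighbour."""
    angles = []
    for k, (x0, y0) in enumerate(centroids):
        # nearest other centroid by squared Euclidean distance
        x1, y1 = min((c for m, c in enumerate(centroids) if m != k),
                     key=lambda c: (c[0] - x0) ** 2 + (c[1] - y0) ** 2)
        dx, dy = x1 - x0, y1 - y0
        if dx < 0:                 # fold directions so angles lie in [-90, 90]
            dx, dy = -dx, -dy
        angles.append(math.degrees(math.atan2(dy, dx)))
    return median(angles)


# synthetic text line tilted by 5 degrees
pts = [(10.0 * i, 10.0 * i * math.tan(math.radians(5.0))) for i in range(10)]
print(estimate_skew_degrees(pts))
```

Because the image y axis points downwards, the sign of the angle is flipped relative to the usual mathematical convention. The median keeps a few wrong neighbour pairs (for example, pairs across text lines) from dominating the estimate.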
I have been trying to recognise handwritten letters (digits/alphabet) from a form-document. As it is known that form-documents have 1d row cells, where the applicant has to fill their information within those bounded cells. However, I'm unable to segment the digits(currently my input consists only digits) from the bounding boxes.
I went through the following steps:
Reading the image (as a grayscale image) via "imread" method of opencv2. Initial Image size:19 x 209(in pixels).
pic = "crop/cropped000.jpg"
newImg = cv2.imread(pic, 0)
Resizing the image 200% its original size via "resize" method of opencv2. I used INTER_AREA Interpolation. Resized Image size: 38 x 418(in pixels)
h,w = newImg.shape
resizedImg = cv2.resize(newImg, (2*w,2*h), interpolation=cv2.INTER_AREA)
Applied Canny edge detection.
v = np.median(resizedImg)
sigma = 0.33
lower = int(max(0, (1.0 - sigma) * v))
upper = int(min(255, (1.0 + sigma) * v))
edgedImg = cv2.Canny(resizedImg, lower, upper)
Cropped the contours and saved them as images in 'BB' directory.
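# OpenCV 3.x returns (image, contours, hierarchy), 4.x returns (contours, hierarchy);
# taking the last two items works with both
contours, hierarchy = cv2.findContours(edgedImg.copy(),cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)[-2:]
num = 0
for c in contours:
    x, y, w, h = cv2.boundingRect(c)
    num += 1
    new_img = resizedImg[y:y+h, x:x+w]
    cv2.imwrite('BB/'+str(num).zfill(3) + '.jpg', new_img)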
Entire code in summary:
import cv2
import numpy as np

pic = "crop/cropped000.jpg"
newImg = cv2.imread(pic, 0)
h,w = newImg.shape
print(newImg.shape)
resizedImg = cv2.resize(newImg, (2*w,2*h), interpolation=cv2.INTER_AREA)
print(resizedImg.shape)
v = np.median(resizedImg)
sigma = 0.33
lower = int(max(0, (1.0 - sigma) * v))
upper = int(min(255, (1.0 + sigma) * v))
edgedImg = cv2.Canny(resizedImg, lower, upper)
# the last two return values are (contours, hierarchy) in both OpenCV 3.x and 4.x
contours, hierarchy = cv2.findContours(edgedImg.copy(),cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)[-2:]
num = 0
for c in contours:
    x, y, w, h = cv2.boundingRect(c)
    num += 1
    new_img = resizedImg[y:y+h, x:x+w]
    cv2.imwrite('BB/'+str(num).zfill(3) + '.jpg', new_img)
Images produced are posted here:
https://imgur.com/a/GStIcdj
I had to double the image size because Canny edge detection was producing double-edges for an object (However, it still does). I have also played with other openCV functionalities like Thresholding, Gaussian Blur, Dilate, Erode but all in vain.
# we need one more parameter for the date cell width, as this could be different for different banks
def crop_image_data_from_date_field(image, new_start_h, new_end_h, new_start_w, new_end_w, cell_width):
    # for the date, each cell has the same height and width (here width: 25 px), so the coordinates change based on the width
    cropped_image_list = []
    starting_width = new_start_w
    for i in range(1,9): # as date has only 8 fields: DD/MM/YYYY
        cropped_img = image[new_start_h:new_end_h, new_start_w + 1 :new_start_w+22]
        new_start_w = starting_width + (i*cell_width)
        cropped_img = cv2.resize(cropped_img, (28, 28))
        image_name = 'cropped_date/cropped_'+ str(i) + '.png'
        cv2.imwrite(image_name, cropped_img)
        cropped_image_list.append(image_name)
    # print('cropped_image_list : ',cropped_image_list,len(cropped_image_list))
    # rec_value = handwritten_digit_recog.recog_digits(cropped_image_list)
    recvd_value = custom_predict.predict_digit(cropped_image_list)
    # print('recvd val : ',recvd_value)
    return recvd_value
You need to specify the width of each cell and its x, y, w, h.
I think this will help you.