I am trying to make a program that automatically corrects the perspective of a rectangle. I have managed to get the silhouette of the rectangle, and have the code to correct the perspective, but I can't find the corners. The biggest problem is that, because it has been deformed, I can't use the following "code":
c1 = min(x), min(y)
c2 = max(x), min(y)
c3 = min(x), max(y)
c4 = max(x), max(y)
This wouldn't work with this situation (X represents a corner):
X0000000000X
.00000000000
..X000000000
.....0000000
........0000
...........X
Does anyone know how to do this?
Farthest point from the center will give you one corner.
Farthest point from the first corner will give you another corner, which may be either adjacent or opposite to the first.
Farthest point from the line between those two corners (a bit more math-intensive) will give you a third corner. I'd use distance from center as a tie breaker.
For finding the 4th corner, it'll be the point outside the triangle formed by the first 3 corners you found, farthest from the nearest line between those corners.
This is a very time consuming way to do it, and I've never tried it, but it ought to work.
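The four steps above could be sketched roughly as follows. This is only an illustration, assuming the silhouette is available as a list of (x, y) pixel coordinates; the function names (`find_corners`, `_line_side`, `_segment_distance`) are made up for this example, and degenerate shapes (all points on one line) are not handled.

```python
import numpy as np

def _line_side(a, b, pts):
    """Signed value: >0 on one side of the line a-b, <0 on the other, 0 on it."""
    return (b[0] - a[0]) * (pts[:, 1] - a[1]) - (b[1] - a[1]) * (pts[:, 0] - a[0])

def _segment_distance(a, b, pts):
    """Distance from every point to the segment a-b."""
    ab = b - a
    t = np.clip(((pts - a) @ ab) / (ab @ ab), 0.0, 1.0)
    return np.linalg.norm(pts - (a + t[:, None] * ab), axis=1)

def find_corners(points):
    pts = np.asarray(points, dtype=float)
    center = pts.mean(axis=0)
    center_dist = np.linalg.norm(pts - center, axis=1)

    # 1. farthest point from the center
    c1 = pts[np.argmax(center_dist)]
    # 2. farthest point from the first corner
    c2 = pts[np.argmax(np.linalg.norm(pts - c1, axis=1))]
    # 3. farthest point from the line c1-c2, distance from center breaks ties
    line_dist = np.abs(_line_side(c1, c2, pts)) / np.linalg.norm(c2 - c1)
    tied = np.flatnonzero(np.isclose(line_dist, line_dist.max()))
    c3 = pts[tied[np.argmax(center_dist[tied])]]
    # 4. among points outside the triangle, the one farthest from its nearest edge
    edges = ((c1, c2), (c2, c3), (c3, c1))
    sides = np.stack([_line_side(a, b, pts) for a, b in edges])
    inside = np.all(sides >= 0, axis=0) | np.all(sides <= 0, axis=0)
    edge_dist = np.min([_segment_distance(a, b, pts) for a, b in edges], axis=0)
    edge_dist[inside] = -1.0  # points inside (or on) the triangle never win
    c4 = pts[np.argmax(edge_dist)]

    return [tuple(float(v) for v in c) for c in (c1, c2, c3, c4)]
```

Every step scans all points once, so for a silhouette of n pixels this is a handful of O(n) passes; running it on the outline only instead of the filled shape would shrink n further.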
You could try to use a scanline algorithm - For every line of the polygon (so y = min(y)..max(y)), get l = min(x) and r = max(x). Calculate the left/right slope (deltax) and compare it with the slope of the line before. If it changed (use some tolerance here), you are at a corner of the rectangle (or close to it). That won't work for all cases, because at low resolution the slope can't be measured that exactly, but for large rectangles whose sides have clearly different slopes, this should work.
At least, it works well for your example:
X0000000000X l = 0, r = 11
.00000000000 l = 1, r = 11, deltaxl = 1, deltaxr = 0
..X000000000 l = 2, r = 11, deltaxl = 1, deltaxr = 0
.....0000000 l = 5, r = 11, deltaxl = 3, deltaxr = 0
........0000 l = 8, r = 11, deltaxl = 3, deltaxr = 0
...........X l = 11, r = 11, deltaxl = 3, deltaxr = 0
You start with the top of the rectangle, where you get two different values for l and r, so you already have two of the corners. On the left side, the first two steps give deltax = 1, but after that you get deltax = 3, so there is a corner at (2, 2), the last point before the slope changed (third row, third column if you count from one). On the right side, nothing changes, deltax = 0, so you only get the point at the end.
Note that you're "collecting" corners here, so if you don't have 4 corners at the end, the slopes were too similar (or you have a picture of a triangle) and you can switch to a different (more exact) algorithm or just give an error. The same if you have more than 4 corners or some other strange things like holes in the rectangle. It seems some kind of image detection is involved, so these cases can occur, right?
There are cases in which a simple deltax = (x - lastx) won't work well; see this example for the left side of a rectangle:
xxxxxx
xxxxx deltax = 1 dy/dx = 1/1 = 1
xxxxx deltax = 0 dy/dx = 2/1 = 2
xxxx deltax = 1 dy/dx = 3/2 = 1.5
xxxx deltax = 0 dy/dx = 4/2 = 2
xxx deltax = 1 dy/dx = 5/3 = 1.66
Sometimes deltax is 0, sometimes it is 1. It's better to use the slope of the line from the current point to the top left/right point (deltay / deltax). Using it, you'll still have to stick with a tolerance, but your values will get more exact with each new line.
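A minimal sketch of the simple per-row deltax variant, assuming the silhouette is given as a list of rows of booleans. The name `scanline_corners` and the `tol` parameter are invented for this example, and the default tolerance of 1 pixel is just a guess that happens to suit the small example above.

```python
def scanline_corners(mask, tol=1):
    """Collect corner candidates by watching the left/right edge slope per row."""
    # (y, leftmost x, rightmost x) for every row that contains filled pixels
    spans = []
    for y, row in enumerate(mask):
        xs = [x for x, filled in enumerate(row) if filled]
        if xs:
            spans.append((y, xs[0], xs[-1]))
    if not spans:
        return []

    corners = []

    def add(point):
        if point not in corners:
            corners.append(point)

    # the top row contributes its two end points
    y0, l0, r0 = spans[0]
    add((l0, y0))
    add((r0, y0))

    # walk down the left edge (index 1) and the right edge (index 2)
    for side in (1, 2):
        prev_delta = None
        for prev, cur in zip(spans, spans[1:]):
            delta = cur[side] - prev[side]
            if prev_delta is not None and abs(delta - prev_delta) > tol:
                # slope changed: the previous row's edge point is a corner
                add((prev[side], prev[0]))
            prev_delta = delta

    # the bottom row contributes its end points as well
    yn, ln, rn = spans[-1]
    add((ln, yn))
    add((rn, yn))
    return corners
```

If the returned list does not contain exactly four points, that is the signal to fall back to a more exact method, as described above.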
You could use a Hough transform to find the 4 most prominent lines in the masked image. These lines will be the sides of the quadrangle.
The lines will intersect in up to 6 points, which are the 4 corners and the 2 perspective vanishing points.
These are easy to distinguish: pick any point inside the quadrangle, and check if the line from this point to each of the 6 intersection points intersects any of the lines. If not, then that intersection point is a corner.
This has the advantage that it works well even for noisy or partially obstructed images, or if your segmentation is not exact.
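The intersection and filtering step might look like the sketch below. It assumes the four lines are already known in the (rho, theta) form that a Hough transform typically produces (for example from cv2.HoughLines), meaning x*cos(theta) + y*sin(theta) = rho. The helper names are made up for this example, and the "does the connecting line cross any side" check is implemented as "do the two points lie on opposite sides of any of the lines".

```python
import itertools
import math

def intersect(line_a, line_b):
    """Intersection of two (rho, theta) lines, or None if they are (nearly) parallel."""
    (r1, t1), (r2, t2) = line_a, line_b
    det = math.cos(t1) * math.sin(t2) - math.sin(t1) * math.cos(t2)
    if abs(det) < 1e-9:
        return None
    x = (r1 * math.sin(t2) - r2 * math.sin(t1)) / det
    y = (r2 * math.cos(t1) - r1 * math.cos(t2)) / det
    return x, y

def signed_distance(line, point):
    """Signed distance of a point from a (rho, theta) line."""
    rho, theta = line
    return point[0] * math.cos(theta) + point[1] * math.sin(theta) - rho

def quad_corners(lines, inside, eps=1e-6):
    """Keep the intersections reachable from `inside` without crossing a line."""
    corners = []
    for line_a, line_b in itertools.combinations(lines, 2):
        q = intersect(line_a, line_b)
        if q is None:
            continue
        crosses = any(
            signed_distance(line, inside) * signed_distance(line, q) < -eps
            for line in lines
        )
        if not crosses:
            corners.append(q)
    return corners
```

For example, with the sides y = 0, y = 4, x = 0 and x + y = 10 and the inside point (2, 2), the intersection of x = 0 with x + y = 10 at (0, 10) is rejected because the side y = 4 lies between it and the inside point.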
en.wikipedia.org/wiki/Hough_transform
Example CImg Code
I would be very interested in your results. I have been thinking about writing something like this myself, to correct photos of paper sheets taken at an angle. I am currently struggling to think of a way to correct the perspective once the 4 points are known.
p.s.
Also check out
Zhengyou Zhang , Li-Wei He, "Whiteboard scanning and image enhancement"
http://research.microsoft.com/en-us/um/people/zhang/papers/tr03-39.pdf
for a more advanced solution for quadrangle detection
I have asked a related question, which tries to solve the perspective transform:
proportions of a perspective-deformed rectangle
This looks like a convex hull problem.
http://en.wikipedia.org/wiki/Convex_hull
Your problem is simpler but the same solution should work.
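As a rough illustration of the convex hull idea, here is a plain-Python hull (Andrew's monotone chain, chosen for this sketch; the answer above does not name a specific algorithm). It assumes the silhouette is a list of integer (x, y) pixel coordinates. On a clean quadrilateral the hull vertices are the corners; on a noisy silhouette the hull will usually have more than four vertices and would need simplifying afterwards.

```python
def convex_hull(points):
    """Return the hull vertices of a set of (x, y) points, collinear points dropped."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); the sign gives the turn direction
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # the last point of each chain is the first point of the other one
    return lower[:-1] + upper[:-1]
```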
Related
I have an image which contains a lot of boxes. My task is to find by how many degrees that image is rotated. I am using OpenCV for this: first I find the lines in the image, then I compute their angles and return the mean or median of those angles as the result.
import math
import cv2

img_gray = cv2.cvtColor(img_before, cv2.COLOR_BGR2GRAY)
img_edges = cv2.Canny(img_gray, 100, 100, apertureSize=3)
lines = cv2.HoughLinesP(img_edges, 1, math.pi / 180.0, 100, minLineLength=1000, maxLineGap=100)
angles = []
for [[x1, y1, x2, y2]] in lines:
    # angle of each detected segment in degrees
    angle = math.degrees(math.atan2(float(y2) - y1, float(x2) - x1))
    angles.append(angle)
This code I am using for finding the angle and returning the result, and for part 2:
(h, w) = img_before.shape[:2]
(cX, cY) = (w // 2, h // 2)
M = cv2.getRotationMatrix2D((cX, cY),angle, 1.0)
img_before = cv2.warpAffine(img_before, M, (w, h))
I am using this part for rotating the image by myself and checking whether I am getting the correct answer or not.
So, now, here comes the problem. My code is working correctly: I am getting an error of less than 0.01 degrees.
But that only holds when I rotate the image by an integer angle; then I always get the correct answer. When I rotate it by a fractional angle (e.g. 1.3 or 1.5), I get worse results. Sometimes it doesn't detect a single line, where for an integer angle it detects more than 2000 lines. Sometimes I get 45 degrees as the resulting line angle even though I rotated the image by 2.5, but when I rotate by 2 or 3 I get the correct result.
Kindly help me with this. Alternatively, is there any other solution that I can use for my problem?
I am trying to find the direction of triangles in an image. below is the image:
These triangles are pointing upward/downward/leftward/rightward. This is not the actual image. I have already used canny edge detection to find edges then contours and then the dilated image is shown below.
My logic to find the direction:
The logic I am thinking of using is this: if, among the three corner coordinates, I can identify the base of the triangle (the two corners that share the same abscissa or the same ordinate), I can build a base vector. The angle between the unit vectors and the base vector can then be used to identify the direction. But this method can only determine whether the triangle is up/down or left/right; it cannot differentiate between up and down or between right and left. I tried to find the corners using cv2.goodFeaturesToTrack, but as far as I know it only gives the 3 strongest points in the entire image. So I am wondering if there is another way to find the direction of the triangles.
Here is my code in python to differentiate between the triangle/square and circle:
import cv2
import numpy as np

# blue masking: pixels with value 25 become white, everything else black
mask_blue = np.copy(img1)
row, columns = mask_blue.shape
for i in range(0, row):
    for j in range(0, columns):
        if mask_blue[i][j] == 25:
            mask_blue[i][j] = 255
        else:
            mask_blue[i][j] = 0
blue_edges = cv2.Canny(mask_blue, 10, 10)
kernel_blue = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (2, 2))
dilated_blue = cv2.dilate(blue_edges, kernel_blue)
blue_contours, hierarchy = cv2.findContours(dilated_blue, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
for cnt in blue_contours:
    area = cv2.contourArea(cnt)
    perimeter = cv2.arcLength(cnt, True)
    M = cv2.moments(cnt)
    if area == 0 or M['m00'] == 0:
        continue  # degenerate contour, the ratios below would divide by zero
    # centroid of the contour
    cx = int(M['m10'] / M['m00'])
    cy = int(M['m01'] / M['m00'])
    # classify by the perimeter^2 / area ratio
    shape = "unknown"
    if 12 < (perimeter * perimeter) / area < 14.8:
        shape = "circle"
    elif 14.8 < (perimeter * perimeter) / area < 18:
        shape = "square"
    elif 18 < (perimeter * perimeter) / area and area > 200:
        shape = "triangle"
    print(shape)
    print(area)
    print((perimeter * perimeter) / area, "\n")
cv2.imshow('mask_blue', dilated_blue)
cv2.waitKey(0)
cv2.destroyAllWindows()
Source image can be found here: img1
Please help: how can I find the direction of the triangles?
Thank you.
Assuming that you only have four cases: [up, down, left, right], this code should work well for you.
The idea is simple:
Get the bounding rectangle for your contour. Use: box = cv2.boundingRect(contour_pnts)
Crop the image using the bounding rectangle.
Reduce the image vertically and horizontally using the Sum option. Now you have the sum of pixels along each axis. The axis with the largest sum determines whether the triangle base is vertical or horizontal.
To identify whether the triangle is pointing left/right or up/down: you need to check whether the bounding rectangle center is before or after the max col/row:
The code (assumes you start from the cropped image):
ver_reduce = cv2.reduce(img, 0, cv2.REDUCE_SUM, None, cv2.CV_32F)
hor_reduce = cv2.reduce(img, 1, cv2.REDUCE_SUM, None, cv2.CV_32F)
#For smoothing the reduced vector, could be removed
ver_reduce = cv2.GaussianBlur(ver_reduce, (3, 1), 0)
hor_reduce = cv2.GaussianBlur(hor_reduce, (1, 3), 0)
_,ver_max, _, ver_col = cv2.minMaxLoc(ver_reduce)
_,hor_max, _, hor_row = cv2.minMaxLoc(hor_reduce)
ver_col = ver_col[0]
hor_row = hor_row[1]
contour_pnts = cv2.findNonZero(img) #in my code I do not have the original contour points
rect_center, size, angle = cv2.minAreaRect(contour_pnts )
print(rect_center)
if ver_max > hor_max:
    if rect_center[0] > ver_col:
        print('right')
    else:
        print('left')
else:
    if rect_center[1] > hor_row:
        print('down')
    else:
        print('up')
Well, Mark has mentioned a solution that may not be as efficient but is perhaps more accurate. I think this one should be equally efficient but perhaps less accurate. Since you already have code that finds triangles, try adding the following after you have found a triangle contour:
hull = cv2.convexHull(cnt) # convex hull of contour
hull = cv2.approxPolyDP(hull,0.1*cv2.arcLength(hull,True),True)
# You can double check if the contour is a triangle here
# by something like len(hull) == 3
You should get 3 hull points for a triangle; these should be the 3 vertices of your triangle. Given that your triangles always 'face' in only 4 directions: for a triangle facing left or right, the Y coordinate of the apex hull point will be close to the Y coordinate of the centroid, and whether it points left or right depends on whether that hull point's X is less than or greater than the centroid X. Similarly, compare the hull and centroid X and Y for a triangle pointing up or down.
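The comparison described above could be sketched like this, assuming the three hull points have already been extracted as plain (x, y) tuples and that image coordinates are used (y grows downward, so "up" means a smaller y). The function name `triangle_direction` is made up for this example, and the centroid is taken as the mean of the three vertices, which for a triangle coincides with the area centroid.

```python
def triangle_direction(vertices):
    """Return 'up', 'down', 'left' or 'right' for a triangle given by 3 (x, y) vertices."""
    cx = sum(x for x, _ in vertices) / 3.0
    cy = sum(y for _, y in vertices) / 3.0

    # The apex is the vertex that lines up with the centroid on one axis.
    # Collect the offset of every vertex on both axes and take the smallest one.
    candidates = []
    for x, y in vertices:
        candidates.append((abs(x - cx), "vertical", x, y))    # apex above/below centroid
        candidates.append((abs(y - cy), "horizontal", x, y))  # apex left/right of centroid
    _, axis, x, y = min(candidates)

    if axis == "vertical":
        return "up" if y < cy else "down"
    return "left" if x < cx else "right"
```

With the contour code above, the input would be something like `[tuple(p[0]) for p in hull]` once `len(hull) == 3` has been checked.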
How can I transform the blue curve values into linear (red curve)? I am doing some tests in excel, but basically I have those blue line values inside a 3D App that I want to manipulate with python so I can make those values linear. Is there any mathematical approach that I am missing?
The x axis goes from 0 to 90, and the y axis from 0 to 1.
For example: in the middle of the graph the blue line gives me a value of "0,70711", and I know that in linear it is "0,5". I was wondering if there's an easy formula to transform all the incoming non-linear values into linear.
I have no idea what "formula" is creating that non-linear blue line, also ignore the yellow line since I was just trying to "reverse engineer" to see if would lead me to any conclusion.
Thank you
Find a linear function y = ax + b that for x = 0 gives the value 1 and for x = 90 gives 0, just like the function that is represented by a blue curve.
In that case, your system of equations is the following:
1 = b // for x = 0
0 = a*90 + b // for x = 90
Solution provided by solver is the following : { a = -1/90, b = 1 }, the red linear function will have form y = ax + b, we put the values of a and b we found from the solver and we discover that the linear function you are looking for is y = -x/90 + 1 .
The tool I used to solve the system of equations:
http://wims.unice.fr/wims/en_tool~linear~linsolver.en.html
What exactly do you mean? You can calculate points on the red line like this:
f(x) = 1-x/90
and the point then is (x,f(x)) = (x, 1-x/90). But to be honest, I think your question is still rather unclear.
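In Python, the red line itself is a one-liner. The second function below goes a step further and rests on an assumption that is not confirmed anywhere in the question: the blue value 0.70711 at x = 45 happens to equal cos(45°), so if the blue curve really is the cosine of the angle, an incoming blue value can be mapped back to the linear one through acos. Both function names are invented for this sketch.

```python
import math

def red_line(x_deg):
    """Linear value on the red line for an x between 0 and 90."""
    return 1.0 - x_deg / 90.0

def linearize(blue_value):
    """Map a blue-curve value to the red line, ASSUMING blue = cos(x in degrees)."""
    v = min(1.0, max(0.0, blue_value))  # keep acos inside its domain
    x_deg = math.degrees(math.acos(v))  # recover x from the assumed cosine
    return red_line(x_deg)
```

If the blue curve turns out to be some other function, only the line that recovers `x_deg` would need to change.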
EDIT - Thanks for all the answers everyone. I think I accidentally led you slightly wrong as the square in the picture below should be a rectangle (I see most of you are referencing squares which seems like it would make my life a lot easier). Also, the x/y lines could go in any direction, so the red dot won't always be at the top y boundary. I was originally going for a y = mx + b solution, but then I got stuck trying to figure out how I know whether to plug in the x or the y (one of them has to be known, obviously).
I have a very simple question (I think) that I'm currently struggling with for some reason. I'm trying to have a type of minimap in my game which shows symbols around the perimeter of the view, pointing towards objectives off-screen.
Anyway, I'm trying to find the value of the red point (while the black borders and everything in green is known):
It seems like simple trigonometry, but for some reason I can't wrap my head around it. I just need to find the "new" x value from the green point to the red point, then I can utilize basic math to get the red point, but how I go about finding that new x is puzzling me.
Thanks in advance!
scale = max(abs(x), abs(y))
x = x / scale
y = y / scale
This is the simple case, for a square from (-1, -1) to (1, 1). If you want a different sized square, multiply the coordinates by sidelen / 2.
If you want a rectangle instead of a square, use the following formula. (This is another solution to the arbitrarily-sized square version)
scale = max(abs(x) / (width / 2), abs(y) / (height / 2))
x = x / scale
y = y / scale
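Wrapped into a function, the rectangle version might look like this. It is a small sketch that assumes (x, y) is the objective's position relative to the centre of the view; the name `project_to_rect` and the error handling for the centre point are additions for this example.

```python
def project_to_rect(x, y, width, height):
    """Scale (x, y) along its direction from the centre onto the rectangle border."""
    scale = max(abs(x) / (width / 2), abs(y) / (height / 2))
    if scale == 0:
        raise ValueError("the centre point has no direction to project along")
    return x / scale, y / scale
```

For an objective at (10, 5) and a 4 by 2 view, the scale is max(10/2, 5/1) = 5, so the marker lands at (2, 1), the top-right corner of the view.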
Let's call the length of one side of the square l. The slope of the line is -y/x. That means, if you move along the line and rise a distance y toward the top of the square, then you'll move a distance x to the left. But since the green point is at the center of the square, you can rise only l/2. You can express this as a ratio:
-y/x = (-l/2)/d
Where d is the distance you'll move to the left. Solving for d, we have
d = x*l / (2*y)
So if the green dot is at (0, 0), the red dot is at (-l/2, xl/2y).
All you need is the angle and the width of the square w.
If the green dot is at (0,0), then the angle is a = atan(y/x), the y-coordinate of the dot is w/2, and therefore the x-coordinate of the dot is (w/2) / tan(a). Note that 1/tan(a) == tan(pi/2 - a), or in other words the angle you really want to plug into tan is the one outside the box.
Edit: yes, this can be done without trig, too. All you need is to interpolate the x-coordinate of the dot on the line. So you know the y-coordinate is w/2, then the x-coordinate is (w/2) * x/y. But, be careful which quadrant of the square you're working with. That formula is only valid for -y<x<y, otherwise you want to reverse x and y.
I've got a shape consisting of four points, A, B, C and D, of which only the positions are known. The goal is to transform these points to have specific angles and offsets relative to each other.
For example: A(-1,-1) B(2,-1) C(1,1) D(-2,1), which should be transformed to a perfect square (all angles 90) with offsets between AB, BC, CD and AD all being 2. The result should be a square slightly rotated counter-clockwise.
What would be the most efficient way to do this?
I'm using this for a simple block simulation program.
As Mark alluded, we can use constrained optimization to find the side 2 square that minimizes the square of the distance to the corners of the original.
We need to minimize f = (a-A)^2 + (b-B)^2 + (c-C)^2 + (d-D)^2 (where the square is actually a dot product of the vector argument with itself) subject to some constraints.
Following the method of Lagrange multipliers, I chose the following distance constraints:
g1 = (a-b)^2 - 4
g2 = (c-b)^2 - 4
g3 = (d-c)^2 - 4
and the following angle constraints:
g4 = (b-a).(c-b)
g5 = (c-b).(d-c)
A quick napkin sketch should convince you that these constraints are sufficient.
We then want to minimize f subject to the g's all being zero.
The Lagrange function is:
L = f + Sum(i = 1 to 5, li gi)
where the lis are the Lagrange multipliers.
The gradient is non-linear, so we have to take a hessian and use multivariate Newton's method to iterate to a solution.
Here's the solution I got (red) for the data given (black):
This took 5 iterations, after which the L2 norm of the step was 6.5106e-9.
While Codie CodeMonkey's solution is a perfectly valid one (and a great use case for Lagrange multipliers at that), I believe it's worth mentioning that if the side length is not given, this particular problem actually has a closed-form solution.
We would like to minimise the distance between the corners of our fitted square and the ones of the given quadrilateral. This is equivalent to minimising the cost function:
f(x1,...,y4) = (x1-ax)^2+(y1-ay)^2 + (x2-bx)^2+(y2-by)^2 +
(x3-cx)^2+(y3-cy)^2 + (x4-dx)^2+(y4-dy)^2
Where Pi = (xi,yi) are the corners of the fitted square and A = (ax,ay) through D = (dx,dy) represent the given corners of the quadrilateral in clockwise order. Since we are fitting a square, we have certain constraints on the positions of the four corners. In fact, two opposite corners are enough to describe a unique square (save for the mirror image on the diagonal).
Parametrization of the points
This means that two opposite corners are enough to represent our target square. We can parametrise the two remaining corners using the components of the first two. In the above example we express P2 and P4 in terms of P1 = (x1,y1) and P3 = (x3,y3). If you need a visualisation of the geometrical intuition behind the parametrisation of a square you can play with the interactive version.
P2 = (x2,y2) = ( (x1+x3-y3+y1)/2 , (y1+y3-x1+x3)/2 )
P4 = (x4,y4) = ( (x1+x3+y3-y1)/2 , (y1+y3+x1-x3)/2 )
Substituting for x2,x4,y2,y4 means that f(x1,...,y4) can be rewritten to:
f(x1,x3,y1,y3) = (x1-ax)^2+(y1-ay)^2 + ((x1+x3-y3+y1)/2-bx)^2+((y1+y3-x1+x3)/2-by)^2 +
(x3-cx)^2+(y3-cy)^2 + ((x1+x3+y3-y1)/2-dx)^2+((y1+y3+x1-x3)/2-dy)^2
a function which only depends on x1,x3,y1,y3. To find the minimum of the resulting function we then set the partial derivatives of f(x1,x3,y1,y3) equal to zero. They are the following:
df/dx1 = 4x1-dy-dx+by-bx-2ax = 0 --> x1 = ( dy+dx-by+bx+2ax)/4
df/dx3 = 4x3+dy-dx-by-bx-2cx = 0 --> x3 = (-dy+dx+by+bx+2cx)/4
df/dy1 = 4y1-dy+dx-by-bx-2ay = 0 --> y1 = ( dy-dx+by+bx+2ay)/4
df/dy3 = 4y3-dy-dx-2cy-by+bx = 0 --> y3 = ( dy+dx+by-bx+2cy)/4
You may see where this is going, as a simple rearrangement of the terms leads to the final solution.
Final solution
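The four expressions for x1, x3, y1 and y3 above, together with the parametrisation of P2 and P4, translate directly into code. The sketch below only restates those formulas; the function name `fit_square` is made up, and the corners must be passed in the same order the formulas assume (A, B, C, D going around the shape in the direction P1, P2, P3, P4 do).

```python
def fit_square(a, b, c, d):
    """Least-squares square for the quadrilateral A, B, C, D; returns P1, P2, P3, P4."""
    ax, ay = a
    bx, by = b
    cx, cy = c
    dx, dy = d

    # stationary point of the cost function (partial derivatives set to zero)
    x1 = (dy + dx - by + bx + 2 * ax) / 4
    x3 = (-dy + dx + by + bx + 2 * cx) / 4
    y1 = (dy - dx + by + bx + 2 * ay) / 4
    y3 = (dy + dx + by - bx + 2 * cy) / 4

    # remaining two corners from the parametrisation by the opposite corners
    p2 = ((x1 + x3 - y3 + y1) / 2, (y1 + y3 - x1 + x3) / 2)
    p4 = ((x1 + x3 + y3 - y1) / 2, (y1 + y3 + x1 - x3) / 2)
    return (x1, y1), p2, (x3, y3), p4
```

A quick sanity check is to feed in a shape that is already a square, for example `fit_square((0, 0), (0, 2), (2, 2), (2, 0))`, which should hand the same four corners back.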