flip with any degree of angle? - pytorch

This is a Python example that flips an image together with its bounding boxes.
Is it possible to rewrite it so that it rotates by an arbitrary angle, such as 20, 25, or 45 degrees?
How should this code be changed to support any angle?
import torchvision.transforms.functional as FT
def flip(image, boxes):
    """
    Flip image horizontally.
    :param image: image, a PIL Image
    :param boxes: bounding boxes in boundary coordinates, a tensor of dimensions (n_objects, 4)
    :return: flipped image, updated bounding box coordinates
    """
    # Flip image
    new_image = FT.hflip(image)
    # Flip boxes (clone first so the caller's tensor is not modified in place)
    new_boxes = boxes.clone()
    new_boxes[:, 0] = image.width - boxes[:, 0] - 1
    new_boxes[:, 2] = image.width - boxes[:, 2] - 1
    new_boxes = new_boxes[:, [2, 1, 0, 3]]
    return new_image, new_boxes

You can use torchvision.transforms.functional.rotate to rotate the image by a given angle (in degrees, counter-clockwise). Note that this rotates only the image; the bounding boxes have to be transformed separately.
import torchvision.transforms.functional as FT
image = FT.rotate(image, angle)
You can also select the angle randomly, either with
import random
# randomly select an angle between 20 and 45 degrees
angle = random.randint(20, 45)
or with torchvision.transforms.RandomRotation:
import torchvision
# select a random rotation between min_angle and max_angle degrees
loader_transform = torchvision.transforms.RandomRotation(degrees=(min_angle, max_angle))
img = loader_transform(img)
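Since the question also asks about the boxes, here is a minimal sketch of one way to update them after a rotation. It is not part of torchvision: `rotate_boxes` is a hypothetical helper, and it assumes the image is rotated counter-clockwise about its center (width / 2, height / 2) without expanding the canvas. Each box is rotated corner by corner and replaced by the axis-aligned rectangle that encloses the rotated corners, so the new box is usually larger than the object. Plain lists are used instead of tensors to keep the example self-contained.

```python
import math

def rotate_boxes(boxes, angle, width, height):
    """Rotate (x_min, y_min, x_max, y_max) boxes counter-clockwise by `angle`
    degrees about the image center and return their axis-aligned envelopes.
    Hypothetical helper; assumes the image itself was rotated the same way."""
    theta = math.radians(angle)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    cx, cy = width / 2.0, height / 2.0
    rotated = []
    for x_min, y_min, x_max, y_max in boxes:
        corners = [(x_min, y_min), (x_max, y_min), (x_max, y_max), (x_min, y_max)]
        xs, ys = [], []
        for x, y in corners:
            dx, dy = x - cx, y - cy
            # Image coordinates have y pointing down, so a visually
            # counter-clockwise rotation uses this sign convention.
            xs.append(cx + dx * cos_t + dy * sin_t)
            ys.append(cy - dx * sin_t + dy * cos_t)
        rotated.append((min(xs), min(ys), max(xs), max(ys)))
    return rotated

new_boxes = rotate_boxes([(10, 20, 30, 40)], angle=90, width=100, height=100)
```

The result may extend past the image border, so in practice you would probably clip the coordinates to the image size afterwards.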


Threshold in a superpixel opencv2

I'm trying to use the cv2 module to obtain the pixel coordinates of relatively dark regions in an image.
First I divide the image into super-pixels with the cv2.ximgproc.createSuperpixelSLIC() method.
Then I'd like to treat each super-pixel as a ROI and threshold it based on its intensity, so that the darker regions (i.e., where the intensity is lower than some preconfigured threshold) are set to 1, and regions where the intensity is relatively high (i.e., larger than this threshold) are set to 0.
I tried the following code, but the problem is that it highlights the background (since it is obviously dark as well).
import cv2
import numpy as np
# Parameters
RULER = 20
N = 10
# ---
# 1) Load the image
img = cv2.imread(IMG_FILE_PATH, cv2.IMREAD_GRAYSCALE)
# 2) Compute the superpixels
slic = cv2.ximgproc.createSuperpixelSLIC(img, region_size=REGION_SIZE, ruler=RULER)
# 3) Get the characteristics of the superpixels calculated
lbls = slic.getLabels()
num_slic = slic.getNumberOfSuperpixels()
# 4) Sample some of the superpixels
sample_idxs = np.random.choice(np.arange(num_slic), size=SAMPLE_SIZE, replace=False)
for idx in sample_idxs:
    img_super_pixel = np.uint8(img * (lbls==idx).astype(np.int16))
    ret, mask_fg = cv2.threshold(img_super_pixel, INTENSITY_TH, 255, cv2.THRESH_BINARY)
    img_super_pixel_th = cv2.bitwise_and(img_super_pixel, img_super_pixel, mask=mask_fg)
    cv2.imshow('Super-pixel', img_super_pixel)
    cv2.imshow('Super-pixel - thresholded', img_super_pixel_th)
Here is a sample image:
Current Output Example:
As can be seen, the background is represented with 1, because it is below the threshold. What I need is for only the black spots inside the super-pixel to be white, while the background and the pixels inside the super-pixel that exceed the threshold are black.
Is there a way to apply the threshold only to the ROI, i.e., the super-pixel, and not to the background?
Thanks in advance.
I was able to solve this by manually checking the pixels in the region which are below the threshold, as shown in the following code:
import cv2
import numpy as np
import pandas as pd
from pathlib import Path
# Parameters
RULER = 20
N = 10
# ---
# 1) Load the image
img = cv2.imread(IMG_FILE_PATH, cv2.IMREAD_GRAYSCALE)
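# 2) Compute the superpixels (iterate() runs the segmentation; using N as the
#    number of iterations is an assumption, since N is otherwise unused)
slic = cv2.ximgproc.createSuperpixelSLIC(img, region_size=REGION_SIZE, ruler=RULER)
slic.iterate(N)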
# 3) Get the characteristics of the superpixels calculated
mask_slic = slic.getLabelContourMask()
lbls = slic.getLabels()
num_slic = slic.getNumberOfSuperpixels()
# 4) Sample some of the superpixels
sample_idxs = np.random.choice(np.arange(num_slic), size=SAMPLE_SIZE, replace=False)
for idx in sample_idxs:
    # 4.1) Create pandas.DataFrame to store the points and their validity based on the threshold
    sp_pixels_df = pd.DataFrame(columns=['x', 'y', 'relevant'])
    # 4.2) Get the current super-pixel
    img_super_pixel = np.uint8(img * (lbls==idx).astype(np.int16))
    # 4.3) Find the coordinates of the pixels inside the current super-pixel
    img_super_pixel_idx = np.argwhere(lbls==idx)
    # 4.4) Separate the coordinates of the points which are located inside the superpixel
    #      (np.argwhere returns (row, column) pairs, so xs holds rows and ys holds columns)
    xs, ys = img_super_pixel_idx[:, 0], img_super_pixel_idx[:, 1]
    # 4.5) Find the pixels inside the superpixel whose intensity is below the threshold
    low_intensity_pixels = img_super_pixel[xs, ys] < INTENSITY_TH
    # 4.6) Populate the pandas.DataFrame
    sp_pixels_df['x'] = xs
    sp_pixels_df['y'] = ys
    sp_pixels_df['relevant'] = low_intensity_pixels
    # 4.7) Get the valid pixel coordinates
    relevant_points = sp_pixels_df.loc[sp_pixels_df.relevant, ['x', 'y']].values
    # 4.8) Separate the coordinates of the relevant points which are located inside the superpixel
    relevant_xs, relevant_ys = relevant_points[:, 0], relevant_points[:, 1]
    # 4.9) Convert the gray-scale image to BGR to be able to mark the relevant pixels in red
    img_super_pixel_highlighted = cv2.cvtColor(img_super_pixel, cv2.COLOR_GRAY2BGR)
    # 4.10) Highlight the relevant pixels
    img_super_pixel_highlighted[relevant_xs, relevant_ys] = (0, 0, 255)
    cv2.imshow('Original Superpixels', img_super_pixel)
    cv2.imshow('Relevant pixels highlighted', img_super_pixel_highlighted)
    # Wait for a key press so the windows are actually drawn
    cv2.waitKey(0)
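The same selection can probably be written more compactly by combining two boolean masks, one for membership in the super-pixel and one for the intensity test. The sketch below is only an illustration under that assumption: `dark_pixels_in_superpixel` is a hypothetical helper, and the tiny arrays stand in for a real grayscale image and the label map returned by getLabels().

```python
import numpy as np

def dark_pixels_in_superpixel(img, labels, idx, intensity_th):
    """Return a boolean mask and the (row, col) coordinates of the pixels
    that belong to super-pixel `idx` and are darker than `intensity_th`."""
    mask = (labels == idx) & (img < intensity_th)
    coords = np.argwhere(mask)
    return mask, coords

# Synthetic stand-ins for the grayscale image and the SLIC label map
img = np.array([[10, 200, 10],
                [30, 220, 15],
                [250, 5, 240]], dtype=np.uint8)
labels = np.array([[0, 0, 1],
                   [0, 0, 1],
                   [2, 2, 2]])

mask, coords = dark_pixels_in_superpixel(img, labels, idx=0, intensity_th=50)
print(coords)
```

Because the background outside the super-pixel never enters the mask, it cannot be mistaken for a dark region.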

How can i get the inner contour points without redundancy in OpenCV - Python

I'm new to OpenCV and I need to get all the contour points. This is easy when setting the cv2.RETR_TREE mode in the findContours method. The problem is that it returns redundant coordinates this way. So, for example, for this polygon, I don't want to get the contour points like this:
But like this:
So, in the first image, the green lines are the contours detected with RETR_TREE mode, and the point pairs 1-2, 3-5, 4-6, ... are redundant because they are so close to each other. I need to merge each group of redundant points into one and append it to the customContours array.
For the moment, I only have the code for the first picture, which labels each point with its index and coordinates:
def getContours(img, minArea=20000, cThr=[100, 100]):
    # font was not defined in the original snippet; FONT_HERSHEY_SIMPLEX is an assumption
    font = cv2.FONT_HERSHEY_SIMPLEX
    imgColor = img
    imgGray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    imgBlur = cv2.GaussianBlur(imgGray, (5, 5), 1)
    imgCanny = cv2.Canny(imgBlur, cThr[0], cThr[1])
    kernel = np.ones((5, 5))
    imgDial = cv2.dilate(imgCanny, kernel, iterations=3)
    imgThre = cv2.erode(imgDial, kernel, iterations=2)
    cv2.imshow('threshold', imgThre)
    contours, hierachy = cv2.findContours(imgThre, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
    customContours = []
    for cnt in contours:
        area = cv2.contourArea(cnt)
        if area > minArea:
            peri = cv2.arcLength(cnt, True)
            approx = cv2.approxPolyDP(cnt, 0.009*peri, True)
            bbox = cv2.boundingRect(approx)
            customContours.append([len(approx), area, approx, bbox, cnt])
            print('points: ', len(approx))
            # Label every approximated point with its index and coordinates
            n = approx.ravel()
            i = 0
            for j in n:
                if i % 2 == 0:
                    x = int(n[i])
                    y = int(n[i + 1])
                    string = str(x)+" " + str(y)
                    cv2.putText(imgColor, str(i//2+1) + ': ' + string, (x, y), font, 2, (0, 0, 0), 2)
                i = i + 1
    customContours = sorted(customContours, key=lambda x: x[1], reverse=True)
    for cnt in customContours:
        cv2.drawContours(imgColor, [cnt[2]], 0, (0, 0, 255), 5)
    return imgColor, customContours
Could you help me get the real points, i.e. as in the second picture?
(EDIT 01/07/21)
I want a generic solution, because the image could be more complex, such as the following picture:
NOTE: the middle arrow (points 17 and 18) doesn't enclose an area, so it isn't a polygon to study, and I'm not interested in obtaining its points. Also, the order of the points isn't important, but if the input is the whole image, the solution should recognize that there are 4 polygons, so the point numbering for each polygon starts at 0, then 1, etc.
Here's my approach. It is mainly morphology-based. It involves convolving the image with a special kernel. This convolution identifies the end-points of the triangle as well as the intersection points where the middle line is present. The result is a points mask containing the pixels that match the points you are looking for. After that, we can apply a little bit of morphology to join possible duplicated points. What remains is to get a list of the coordinates of these points for further processing.
These are the steps:
1. Get a binary image of the input via Otsu's thresholding
2. Get the skeleton of the binary image
3. Define the special kernel and convolve the skeleton image
4. Apply a morphological dilate to join possible duplicated points
5. Get the centroids of the points and store them in a list
Here's the code:
# Imports:
import numpy as np
import cv2
# image path
path = "D://opencvImages//"
fileName = "triangle.png"
# Reading an image in default mode:
inputImage = cv2.imread(path + fileName)
# Prepare a deep copy for results:
inputImageCopy = inputImage.copy()
# Convert BGR to Grayscale
grayImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)
# Threshold via Otsu:
_, binaryImage = cv2.threshold(grayImage, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
The first bit computes the binary image. Very straightforward. I'm using this image as base, which is just a cleaned-up version of what you posted without the annotations. This is the resulting binary image:
Now, to perform the convolution we must first get the image "skeleton". The skeleton is a version of the binary image where lines have been normalized to have a width of 1 pixel. This is useful because we can then convolve the image with a 3 x 3 kernel and look for specific pixel patterns. Let's compute the skeleton using OpenCV's extended image processing module:
# Get image skeleton:
skeleton = cv2.ximgproc.thinning(binaryImage, None, 1)
This is the image obtained:
We can now apply the convolution. The approach is based on Mark Setchell's info on this post. The post mainly shows the method for finding end-points of a shape, but I extended it to also identify line intersections, such as the middle portion of the triangle. The main idea is that the convolution yields a very specific value where patterns of black and white pixels are found in the input image. Refer to the post for the theory behind this idea, but here, we are looking for two values: 110 and 40. The first one occurs when an end-point has been found. The second one occurs when a line intersection is found. Let's set up the convolution:
# Threshold the skeleton so that white (foreground) pixels get a value of 10
# and black (background) pixels a value of 0:
_, binaryImage = cv2.threshold(skeleton, 128, 10, cv2.THRESH_BINARY)
# Set the convolution kernel:
h = np.array([[1, 1, 1],
[1, 10, 1],
[1, 1, 1]])
# Convolve the image with the kernel:
imgFiltered = cv2.filter2D(binaryImage, -1, h)
# Create list of thresholds:
thresh = [110, 40]
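To see where the value 110 comes from, here is a small NumPy-only sketch (not the OpenCV call used in this answer) that applies the same 3 x 3 weights to a synthetic one-pixel-wide line. `neighbour_weighted_sum` is a hypothetical helper, and it pads with zeros at the border, which is an assumption that differs from the default border handling of cv2.filter2D. An end-point has exactly one foreground neighbour, so it scores 10 * 10 + 10 = 110, while an interior line pixel has two neighbours and scores 120.

```python
import numpy as np

def neighbour_weighted_sum(binary, centre_weight=10):
    """Apply a 3x3 filter: centre pixel times `centre_weight` plus the sum
    of its 8 neighbours (zero padding at the image border)."""
    padded = np.pad(binary.astype(np.int32), 1)
    h, w = binary.shape
    total = np.zeros((h, w), dtype=np.int32)
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            weight = centre_weight if (di == 0 and dj == 0) else 1
            total += weight * padded[1 + di:1 + di + h, 1 + dj:1 + dj + w]
    return total

# Synthetic skeleton: a horizontal line, foreground value 10, background 0
skeleton = np.zeros((5, 7), dtype=np.uint8)
skeleton[2, 1:6] = 10

response = neighbour_weighted_sum(skeleton)
end_points = np.argwhere(response == 110)
print(end_points)
```

Only the two ends of the line match 110, which is the behaviour the end-point threshold relies on.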
The first part is done. We are going to detect end-points and intersections in two separate steps. Each step produces a partial result, and we can OR both results to get a final mask:
# Prepare the final mask of points:
(height, width) = binaryImage.shape
pointsMask = np.zeros((height, width, 1), np.uint8)
# Look up each target value and create the points mask:
for currentThresh in thresh:
    # Locate the current threshold in the filtered image:
    tempMat = np.where(imgFiltered == currentThresh, 255, 0)
    # Convert and shape the image to a uint8 height x width x channels
    # numpy array:
    tempMat = tempMat.astype(np.uint8)
    tempMat = tempMat.reshape(height, width, 1)
    # Accumulate mask:
    pointsMask = cv2.bitwise_or(pointsMask, tempMat)
This is the final mask of points:
Note that the white pixels are the locations that matched our target patterns. Those are the points we are looking for. As the shape is not a perfect triangle, some points could be duplicated. We can "merge" neighboring blobs by applying a morphological dilation:
# Set kernel (structuring element) size:
kernelSize = 7
# Set operation iterations:
opIterations = 3
# Get the structuring element:
morphKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernelSize, kernelSize))
# Perform Dilate:
morphoImage = cv2.morphologyEx(pointsMask, cv2.MORPH_DILATE, morphKernel, None, None, opIterations, cv2.BORDER_REFLECT101)
This is the result:
Very nice, we have now big clusters of pixels (or blobs). To get their coordinates, one possible approach would be to get the bounding rectangles of these contours and compute their centroids:
# Look for the outer contours (no children):
contours, _ = cv2.findContours(morphoImage, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
# Store the points here:
pointsList = []
# font was not defined in the original snippet; FONT_HERSHEY_SIMPLEX is an assumption
font = cv2.FONT_HERSHEY_SIMPLEX
# Loop through the contours:
for i, c in enumerate(contours):
    # Get the contours bounding rectangle:
    boundRect = cv2.boundingRect(c)
    # Get the centroid of the rectangle:
    cx = int(boundRect[0] + 0.5 * boundRect[2])
    cy = int(boundRect[1] + 0.5 * boundRect[3])
    # Store centroid into list:
    pointsList.append( (cx,cy) )
    # Set centroid circle and text:
    color = (0, 0, 255)
    cv2.circle(inputImageCopy, (cx, cy), 3, color, -1)
    string = str(cx) + ", " + str(cy)
    cv2.putText(inputImageCopy, str(i) + ':' + string, (cx, cy), font, 0.5, (255, 0, 0), 1)
# Show image:
cv2.imshow("Circles", inputImageCopy)
cv2.waitKey(0)
These are the points located in the original input:
Note also that I've stored their coordinates in the pointsList list:
# Print the list of points:
print(pointsList)
This prints the centroids as (centroidX, centroidY) tuples:
[(717, 971), (22, 960), (183, 587), (568, 586), (388, 98)]

How to set HoughCircles parameters automatically to detect different sizes of circles in opencv python?

I want to set the HoughCircles parameters automatically so that circles of all sizes in an image are detected, including groups of same-size circles.
I am working with images that each contain a group of same-size circles, but the circle size differs from one image to the next.
How can I set the HoughCircles parameters automatically so that the group of circles is detected in any image?
Please help.
Thank you.
If you're looking to collectively just "bin" same-size circles, the below should serve as a good starting point that can be tweaked for your application.
import cv2
import numpy as np
img = cv2.imread('C:\\Test\\circles.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
circles = cv2.HoughCircles(gray, cv2.HOUGH_GRADIENT, dp=2.0, minDist=50, minRadius=20, maxRadius=250)
# Map each detected radius to the list of circles that have it
radius_map = {}
if circles is not None:
    circles = np.round(circles[0, :]).astype("int")
    for (x, y, r) in circles:
        # setdefault avoids a KeyError for radii outside a pre-built range (e.g. r == 250)
        radius_map.setdefault(int(r), []).append((int(x), int(y), int(r)))
# Show one window per radius with all circles of that size
for key in sorted(radius_map):
    output = img.copy()
    for x, y, r in radius_map[key]:
        cv2.circle(output, (x, y), r, (0, 255, 0), 4)
    cv2.imshow(f"Radius {key}", output)
cv2.waitKey(0)
cv2.destroyAllWindows()
If you need a tolerance band (say, circles with radius 50 and 51 should count as the same size), you can iterate over the radius_map dict object and group neighboring radius bins together.
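One possible way to do that grouping is sketched below. `group_radii` is a hypothetical helper, not part of OpenCV, and the circle tuples are made-up sample data. It sorts the circles by radius and starts a new group whenever the gap to the previous radius exceeds the tolerance. Note that this chains neighbours, so radii 50, 51 and 52 end up in one group with a tolerance of 1.

```python
def group_radii(circles, tolerance=1):
    """Group (x, y, r) circles so that consecutive sorted radii inside a
    group differ by at most `tolerance`."""
    groups = []
    for circle in sorted(circles, key=lambda c: c[2]):
        if groups and circle[2] - groups[-1][-1][2] <= tolerance:
            groups[-1].append(circle)
        else:
            groups.append([circle])
    return groups

# Made-up detections standing in for the output of cv2.HoughCircles
circles = [(10, 10, 50), (40, 40, 51), (80, 80, 70), (5, 5, 52)]
groups = group_radii(circles, tolerance=1)
for group in groups:
    print([r for _, _, r in group])
```

If chaining is not what you want, a fixed bin width (for example `r // 2`) would be a simpler alternative.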
Input Image:
Output Images:

How do I find corners of a paper when there are printed corners/lines on paper itself?

I'm using OpenCV in Python to find the corners of a sheet of paper so that I can unwarp it.
img = cv2.imread(images[i])
corners = cv2.goodFeaturesToTrack(cv2.cvtColor(img,cv2.COLOR_BGR2GRAY),4,.01,1000,useHarrisDetector=True,k=.04)
corners = np.float32(corners)
ratio = 1.6
cardH = math.sqrt((corners[2][0][0] - corners[1][0][0]) * (corners[2][0][0] - corners[1][0][0]) + (corners[2][0][1] - corners[1][0][1]) * (
corners[2][0][1] - corners[1][0][1]))
cardW = ratio * cardH;
pts2 = np.float32(
[[corners[0][0][0], corners[0][0][1]], [corners[0][0][0] + cardW, corners[0][0][1]], [corners[0][0][0] + cardW, corners[0][0][1] + cardH],
[corners[0][0][0], corners[0][0][1] + cardH]])
M = cv2.getPerspectiveTransform(corners, pts2)
offsetSize = 500
transformed = np.zeros((int(cardW + offsetSize), int(cardH + offsetSize)), dtype=np.uint8);
dst = cv2.warpPerspective(img, M, transformed.shape)
As you can see in these images, it is detecting edges inside the paper itself rather than the corners of the paper. Should I consider using a different algorithm entirely? I'm quite lost.
I've tried increasing the minimum Euclidean distance to 1000, but that really didn't do anything.
Please note, this is no one's real information; it is a fake dataset found on Kaggle.
The Kaggle dataset can be found at https://www.kaggle.com/mcvishnu1/fake-w2-us-tax-form-dataset
Here is one way to do that in Python/OpenCV.
Note that the found corners are listed counter-clockwise from the top-most corner.
1. Read the input
2. Convert to gray
3. Gaussian blur
4. Otsu threshold
5. Morphology open/close to clean up the threshold
6. Get largest contour
7. Approximate a polygon from the contour
8. Get the corners
9. Draw the polygon on the input
10. Compute side lengths
11. Compute output corresponding corners
12. Get perspective transformation matrix from corresponding corner points
13. Warp the input image according to the matrix
14. Save the results
import cv2
import numpy as np
# read image
img = cv2.imread("efile.jpg")
# convert img to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# blur image
blur = cv2.GaussianBlur(gray, (3,3), 0)
# do otsu threshold on gray image
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)[1]
# apply morphology
kernel = np.ones((7,7), np.uint8)
morph = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel)
morph = cv2.morphologyEx(morph, cv2.MORPH_OPEN, kernel)
# get largest contour
contours = cv2.findContours(morph, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
contours = contours[0] if len(contours) == 2 else contours[1]
area_thresh = 0
for c in contours:
    area = cv2.contourArea(c)
    if area > area_thresh:
        area_thresh = area
        big_contour = c
# draw white filled largest contour on black just as a check to see it got the correct region
page = np.zeros_like(img)
cv2.drawContours(page, [big_contour], 0, (255,255,255), -1)
# get perimeter and approximate a polygon
peri = cv2.arcLength(big_contour, True)
corners = cv2.approxPolyDP(big_contour, 0.04 * peri, True)
# draw polygon on input image from detected corners
polygon = img.copy()
cv2.polylines(polygon, [corners], True, (0,0,255), 1, cv2.LINE_AA)
# Alternate: cv2.drawContours(page,[corners],0,(0,0,255),1)
# print the number of found corners and the corner coordinates
# They seem to be listed counter-clockwise from the top most corner
print(len(corners))
print(corners)
# for simplicity get average of top/bottom side widths and average of left/right side heights
# note: probably better to get average of horizontal lengths and of vertical lengths
width = 0.5*( (corners[0][0][0] - corners[1][0][0]) + (corners[3][0][0] - corners[2][0][0]) )
height = 0.5*( (corners[2][0][1] - corners[1][0][1]) + (corners[3][0][1] - corners[0][0][1]) )
# convert to plain Python ints (np.int0 was removed in NumPy 2.0)
width = int(width)
height = int(height)
# reformat input corners to x,y list
icorners = []
for corner in corners:
    pt = [ corner[0][0],corner[0][1] ]
    icorners.append(pt)
icorners = np.float32(icorners)
# get corresponding output corners from width and height
ocorners = [ [width,0], [0,0], [0,height], [width,height] ]
ocorners = np.float32(ocorners)
# get perspective tranformation matrix
M = cv2.getPerspectiveTransform(icorners, ocorners)
# do perspective
warped = cv2.warpPerspective(img, M, (width, height))
# write results
cv2.imwrite("efile_thresh.jpg", thresh)
cv2.imwrite("efile_morph.jpg", morph)
cv2.imwrite("efile_polygon.jpg", polygon)
cv2.imwrite("efile_warped.jpg", warped)
# display it
cv2.imshow("efile_thresh", thresh)
cv2.imshow("efile_morph", morph)
cv2.imshow("efile_page", page)
cv2.imshow("efile_polygon", polygon)
cv2.imshow("efile_warped", warped)
# wait for a key press so the windows are actually drawn
cv2.waitKey(0)
cv2.destroyAllWindows()
Thresholded image:
Morphology cleaned image:
Polygon drawn on input:
Extracted Corners (counterclockwise from top right corner)
[[[693 67]]
[[ 23 85]]
[[ 62 924]]
[[698 918]]]
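As the code comment in this answer hints, averaging true side lengths is probably a better estimate of the output size than averaging coordinate differences when the page is tilted. The sketch below shows one way to do that with Euclidean distances, using the four corners listed above. `side_lengths` is a hypothetical helper and assumes the corner order top-right, top-left, bottom-left, bottom-right.

```python
import numpy as np

def side_lengths(corners):
    """Return (width, height) as the rounded average Euclidean length of the
    horizontal and vertical sides. Assumes the corner order
    top-right, top-left, bottom-left, bottom-right."""
    tr, tl, bl, br = np.asarray(corners, dtype=np.float32)
    width = 0.5 * (np.linalg.norm(tr - tl) + np.linalg.norm(br - bl))
    height = 0.5 * (np.linalg.norm(bl - tl) + np.linalg.norm(br - tr))
    return int(round(float(width))), int(round(float(height)))

corners = [[693, 67], [23, 85], [62, 924], [698, 918]]
width, height = side_lengths(corners)
print(width, height)
```

The resulting pair could then be used in place of the width and height computed from coordinate differences.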
Warped Result:

Rotating 2D grayscale image with transformation matrix

I am new to image processing, so I am really confused about the coordinate system used with images. I have a sample image and I am trying to rotate it 45 degrees clockwise. My transformation matrix is T = [ [cos45 sin45] [-sin45 cos45] ]
Here is the code:
import numpy as np
from matplotlib import pyplot as plt
from skimage import io
image = io.imread('sample_image')
img_transformed = np.zeros((image.shape), dtype=np.uint8)
trans_matrix = np.array([[np.cos(45), np.sin(45)], [-np.sin(45), np.cos(45)]])
for i, row in enumerate(image):
    for j,col in enumerate(row):
        pixel_data = image[i,j] #get the value of pixel at corresponding location
        input_coord = np.array([i, j]) #this will be my [x,y] matrix
        result = trans_matrix @ input_coord
        i_out, j_out = result #store the resulting coordinate location
        #make sure the i and j values remain within the index range
        if (0 < int(i_out) < image.shape[0]) and (0 < int(j_out) < image.shape[1]):
            img_transformed[int(i_out)][int(j_out)] = pixel_data
plt.imshow(img_transformed, cmap='gray')
The image comes out distorted and doesn't seem right. I know that in pixel coordinates the origin is at the top left corner (row, column). Is the rotation happening with respect to the origin at the top left corner? Is there a way to shift the origin to the center or any other given point?
Thank you all!
Yes, as you suspect, the rotation is happening with respect to the top left corner, which has coordinates (0, 0). (Also: the NumPy trigonometric functions use radians rather than degrees, so you need to convert your angle.) To compute a rotation with respect to the center, you do a little hack: you compute the transformation for moving the image so that it is centered on (0, 0), then you rotate it, then you move the result back. You need to compose these transformations into a single one, because if you apply them to the image one after the other, everything that lands at negative coordinates after the first step is lost.
It's much, much easier to do this using Homogeneous coordinates, which add an extra "dummy" dimension to your image. Here's what your code would look like in homogeneous coordinates:
import numpy as np
from matplotlib import pyplot as plt
from skimage import io
image = io.imread('sample_image')
img_transformed = np.zeros((image.shape), dtype=np.uint8)
c, s = np.cos(np.radians(45)), np.sin(np.radians(45))
rot_matrix = np.array([[c, s, 0], [-s, c, 0], [0, 0, 1]])
x, y = np.array(image.shape) // 2
# move center to (0, 0)
translate1 = np.array([[1, 0, -x], [0, 1, -y], [0, 0, 1]])
# move center back to (x, y)
translate2 = np.array([[1, 0, x], [0, 1, y], [0, 0, 1]])
# compose all three transformations together
trans_matrix = translate2 @ rot_matrix @ translate1
for i, row in enumerate(image):
    for j, col in enumerate(row):
        pixel_data = image[i, j]  # get the value of pixel at corresponding location
        input_coord = np.array([i, j, 1])  # homogeneous [i, j, 1] coordinate vector
        result = trans_matrix @ input_coord
        i_out, j_out, _ = result  # store the resulting coordinate location
        # make sure the i and j values remain within the index range
        if (0 <= i_out < image.shape[0]) and (0 <= j_out < image.shape[1]):
            img_transformed[int(i_out)][int(j_out)] = pixel_data
plt.imshow(img_transformed, cmap='gray')
The above should work ok, but you will probably get some black spots due to aliasing. What can happen is that no coordinates i, j from the input land exactly on an output pixel, so that pixel never gets updated. Instead, what you need to do is iterate over the pixels of the output image, then use the inverse transform to find which pixel in the input image maps closest to that output pixel. Something like:
inverse_tform = np.linalg.inv(trans_matrix)
for i, j in np.ndindex(img_transformed.shape):
    i_orig, j_orig, _ = np.round(inverse_tform @ [i, j, 1]).astype(int)
    if i_orig in range(image.shape[0]) and j_orig in range(image.shape[1]):
        img_transformed[i, j] = image[i_orig, j_orig]
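For larger images the Python loop gets slow, so here is a minimal vectorized sketch of the same inverse-mapping idea. `rotate_about_center` is a hypothetical helper that assumes a 2D grayscale array, uses the same (row, column) matrix convention as the code above, and does nearest-neighbour lookup only.

```python
import numpy as np

def rotate_about_center(image, angle_deg):
    """Rotate a 2D array about its center with inverse nearest-neighbour
    mapping, using homogeneous (row, col, 1) coordinates."""
    rows, cols = image.shape
    c, s = np.cos(np.radians(angle_deg)), np.sin(np.radians(angle_deg))
    ci, cj = rows // 2, cols // 2
    rot = np.array([[c, s, 0], [-s, c, 0], [0, 0, 1]])
    to_origin = np.array([[1, 0, -ci], [0, 1, -cj], [0, 0, 1]])
    back = np.array([[1, 0, ci], [0, 1, cj], [0, 0, 1]])
    inverse = np.linalg.inv(back @ rot @ to_origin)

    # Homogeneous coordinates of every output pixel, shape (3, rows * cols)
    ii, jj = np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij")
    ii, jj = ii.ravel(), jj.ravel()
    out_coords = np.stack([ii, jj, np.ones_like(ii)])

    # Map each output pixel back to its nearest source pixel
    src = np.round(inverse @ out_coords).astype(int)
    si, sj = src[0], src[1]
    valid = (si >= 0) & (si < rows) & (sj >= 0) & (sj < cols)

    out = np.zeros_like(image)
    out[ii[valid], jj[valid]] = image[si[valid], sj[valid]]
    return out

sample = np.arange(25).reshape(5, 5)
rotated = rotate_about_center(sample, 90)
```

With this matrix convention, a 90 degree angle on an odd-sized square array gives the same result as `np.rot90(sample, k=-1)`, which is a handy sanity check.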
Hope this helps!
