Need to find the images are same , even if it has different resolution and size. No need of a pixel to pixel comparison, need to find all the images , texts, color etc. are same in the other image also.
Tried with different python packages to compare, but all are asking for same resolution. One of my image is screenshotted from Mac and other is from Ubuntu. Even both are same html, the contrast and resolution difference of the machines causing the images to become different when compare.
Tried.
Perceptual diff,
Image Hash etc.
• PDIFFER – Python wrapper for perceptualdiff tool (https://pypi.org/project/pdiffer/)
Problem - pip install pdiffer failed to install in the Macs for the latest as well as old versions
• NEEDLE - Installed needle (https://needle.readthedocs.io/en/latest/) This one has an option to specify comparison engine to be perceptualdiff/Imagemagick instead of default PIL.
Problem Found - Has an option to save baseline image first and then run assertions on them. It works when baseline is saved and then compared. I didn’t find anything that would compare the screenshots with existing images.
• OPENCV – Histogram based comparison. This converts images into grayscale, and into histograms and compares the histograms. Returns a value between -1 and 1 (-1 means not similar at all and 1 means highly similar) (https://www.pyimagesearch.com/2014/07/14/3-ways-compare-histograms-using-opencv-python/ )
Findings - I tested two images by converting into histograms and compared them Which returned a value of 0.8 (meaning somewhat similar).
Below code i tried using imagehash:
from PIL import Image import imagehash
image_one = 'result.png'
img = Image.open(image_one) image_one_hash = imagehash.whash(img)
print(image_one_hash)
image_two = 'not-found-02.png'
img2 = Image.open(image_two) image_two_hash = imagehash.whash(img2)
print(image_two_hash)
similarity = image_one_hash - image_two_hash print(similarity)
Related
im using python 3.10.5
and heres my code
import pyautogui
target = pyautogui.locateCenterOnScreen('target.png')
print(target)
pyautogui.moveTo(target)
but for some reason it just prints None
and doesnt move the mouse to the images coordinates
The fact that it prints None means it didn't find the image. Check the image you are trying to find (is it properly cropped?) or try setting the confidence parameter to make a match more likely.
pyautogui.locateCenterOnScreen('target.png', confidence=x)
# x can be anywhere between 1 and 0, the lower the more likely a match
import pyautogui
target = pyautogui.locateCenterOnScreen('target.png', confidence = 0.5)
#start at 0.5 and then scale as needed.
print(target.x,target.y)
pyautogui.moveTo(target.x,target.y)
when returning location from locateCenterOnScreen, it needs coordinates, I use this one extensively, with locateonscreen to check for validity first, then the former for actual utility. I use .sleep() extensively as the target applications usually don't respond at computer speed.
I was trying to use pytesseract to find the box positions of each letter in an image. I tried to use an image, and cropping it with Pillow and it worked, but when I tried with a lower character size image (example), the program may recognize the characters, but cropping the image with the box coordinates give me images like this. I also tried to double up the size of the original image, but it changed nothing.
img = Image.open('imgtest.png')
data=pytesseract.image_to_boxes(img)
dati= data.splitlines()
corde=[]
for i in dati[0].split()[1:5]: #just trying with the first character
corde.append(int(i))
im=img.crop(tuple(corde))
im.save('cimg.png')
If we stick to the source code of image_to_boxes, we see, that the returned coordinates are in the following order:
left bottom right top
From the documentation on Image.crop, we see, that the expected order of coordinates is:
left upper right lower
Now, it also seems, that pytesseract iterates images from bottom to top. Therefore, we also need to further convert the top/upper and bottom/lower coordinates.
That'd be the reworked code:
from PIL import Image
import pytesseract
img = Image.open('MJwQi9f.png')
data = pytesseract.image_to_boxes(img)
dati = data.splitlines()
corde = []
for i in dati[0].split()[1:5]:
corde.append(int(i))
corde = tuple([corde[0], img.size[1]-corde[3], corde[2], img.size[1]-corde[1]])
im = img.crop(tuple(corde))
im.save('cimg.png')
You see, left and right are in the same place, but top/upper and bottom/lower switched places, and where also altered w.r.t. the image height.
And, that's the updated output:
The result isn't optimal, but I assume, that's due to the font.
----------------------------------------
System information
----------------------------------------
Platform: Windows-10-10.0.16299-SP0
Python: 3.9.1
Pillow: 8.1.0
pytesseract: 4.00.00alpha
----------------------------------------
I am working on a digit recognition task using Tesseract and OpenCV. I did use it and came to a solution that is specific for a particular image. If I change my image I do not obtain correct results. I should change the threshold value according to the image. The steps I did was:
Cropping the image to find an appropriate region
Change image into grayscale
Using Gaussian Blur
Taking appropriate threshold
passing the image through Tesseract
So, My question is how can I make my code generic i.e. it can be used for different images without updating my code.
While working on this image I processed as`
imggray=cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
imgBlur=cv2.GaussianBlur(imggray,(5,5), 0)
imgDil=cv2.dilate(imgBlur,np.ones((5,5),np.uint8),iterations=1)
imgEro=cv2.erode(imgDil,np.ones((5,5),np.uint8),iterations=2)
ret,imgthresh=cv2.threshold(imgEro,28,255, cv2.THRESH_BINARY )
And for this Image as
imggray=cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
imgBlur=cv2.GaussianBlur(imggray,(5,5), 0)
imgDil=cv2.dilate(imgBlur,np.ones((5,5),np.uint8),iterations=0)
imgEro=cv2.erode(imgDil,np.ones((5,5),np.uint8),iterations=0)
ret,imgthresh=cv2.threshold(imgEro,37,255, cv2.THRESH_BINARY )
I had to change the value of iterations and the minimum threshold to obtain proper results. What can be the solution so I should not change the values?
I'm currently trying to perform a Polar to Cartesian Coordinate Image transformation, to display a raw sonar image into a 'fan-display'.
Initially I have a Numpy Array image of type np.float64, that can be seen below:
After doing some searching, I came across this StackOverflow post Inverse transform an image from Polar to Cartesian in OpenCV with a very similar problem, in which the poster seemed to have solved his/her issue by using the Python Wand library (http://docs.wand-py.org/en/0.5.9/index.html), specifically using their set of Distortion functions.
However, when I tried to use Wand and read the image in, I instead ended up with Wand getting the image below, which seems to be smaller than the original one. However, the weird thing is that img.size still gives the same size number as the original image's shape.
The code for this transformation can be seen below:
print(raw_img.shape)
wand_img = Image.from_array(raw_img.astype(np.uint8), channel_map="I") #=> (369, 256)
display(wand_img)
print("Current image size", wand_img.size) #=> "Current image size (369, 256)"
This is definitely quite problematic as Wand will automatically give the wrong 'fan image'. Is anybody familiar with this kind of problem with the Wand library previously, and if yes, may I ask what is the recommended solution to fix this issue?
If this issue isn't resolved soon I have an alternative backup of using OpenCV's cv::remap function (https://docs.opencv.org/4.1.2/da/d54/group__imgproc__transform.html#ga5bb5a1fea74ea38e1a5445ca803ff121). However the problem with this is that I'm not sure what mapping arrays (i.e. map_x and map_y) to use to perform the Polar->Cartesian transformation, as using a mapping matrix that implements the transformation equations below:
r = polar_distances(raw_img)
x = r * cos(theta)
y = r * sin(theta)
didn't seem to work and instead threw out errors from OpenCV as well.
Any kind of help and insight into this issue is greatly appreciated. Thank you!
- NickS
EDIT I've tried on another image example as well, and it still shows a similar problem. So first, I imported the image into Python using OpenCV, using these lines of code:
import matplotlib.pyplot as plt
from wand.image import Image
from wand.display import display
import cv2
img = cv2.imread("Test_Img.jpg")
img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
plt.figure()
plt.imshow(img_rgb)
plt.show()
which showed the following display as a result:
However, as I continued and tried to open the img_rgb object with Wand, using the code below:
wand_img = Image.from_array(img_rgb)
display(img_rgb)
I'm getting the following result instead.
I tried to open the image using wand.image.Image() on the file directly, which is able to display the image correctly when using display() function, so I believe that there isn't anything wrong with the wand library installation on the system.
Is there a missing step that I required to convert the numpy into Wand Image that I'm missing? If so, what would it be and what is the suggested method to do so?
Please do keep in mind that I'm stressing the conversion of Numpy to Wand Image quite crucial, the raw sonar images are stored as binary data, thus the required use of Numpy to convert them to proper images.
Is there a missing step that I required to convert the numpy into Wand Image that I'm missing?
No, but there is a bug in Wand's Numpy implementation in Wand 0.5.x. The shape of OpenCV's ndarray is (ROWS, COLUMNS, CHANNELS), but Wand's ndarray is (WIDTH, HEIGHT, CHANNELS). I believe this has been fixed for the future 0.6.x releases.
If so, what would it be and what is the suggested method to do so?
Swap the values in img_rgb.shape before passing to Wand.
img_rgb.shape = (img_rgb.shape[1], img_rgb.shape[0], img_rgb.shape[2],)
with Image.from_array(img_rgb) as img:
display(img)
I recently ran across the PhotUtils package and am trying to use it to perform PSF Photometry on some images I have. However, when I try to run the code, I get very strange results. When I plot the image generated by get_residual_image(), the stars are not removed well. Some sample images are shown below.
The first image has sigma set to 2.05, as it is in one of the sample programs in the PhotUtils documentation:
However, the stars only appear to be removed in their center.
The second image has sigma set to 5.0. This one is especially strange. Some stars are way over-removed, some are under removed, some black squares are added to the image, etc.
Here is my code:
import photutils
from photutils.psf import DAOPhotPSFPhotometry as DAOP
from photutils.psf import IntegratedGaussianPRF as PRF
from photutils.background import MMMBackground
bkg = MMMBackground()
background = 2.5*bkg(img)
gaussian_prf = PRF(sigma=5.0)
gaussian_prf.sigma.fixed = False
photTester = DAOP(8,background,5,gaussian_prf,31)
photResults = photTester(imgStars)
finalImg = photTester.get_residual_image()
After this, I simply plot the original and final image in MatPlotLib. I use a greyscale colormap. The reason that the left images appear slightly darker is that they use a different color scaling.
Perhaps I have set one of the parameters incorrectly?
Could someone help me out with this? Thank you!
Looking at the residual image instantly told me that the background subtraction might be wrong. I could reproduce the result and wondered, if MMMBackground did not do the job correctly.
After taking a closer look at the documentation, Getting startet with Photutils finally gave the essential hint:
image -= np.median(image)