I'm Marius, a first-year maths student.
We have received a team assignment in which we have to implement a Fourier transform, and we chose to try encoding an image as a JPEG.
To simplify the problem, we chose to handle only greyscale pictures.
This is my code so far:
from PIL import Image
import numpy as np
import sympy as sp
#
# ALL INFORMATION, NO CALCULATIONS
img = Image.open('mario.png')
img = img.convert('L')  # convert to monochrome picture
img.show()  # opens the picture
pixels = list(img.getdata())
print(pixels)  # to see if we got the pixel numeric values correct
grootte = list(img.size)
print(len(pixels))  # to check if the amount of pixels is correct
kolommen, rijen = img.size
print("the number of columns is", kolommen, "the number of rows is", rijen)
# up to here, all information
pixelMatrix = []
while pixels != []:
    pixelMatrix.append(pixels[:kolommen])
    pixels = pixels[kolommen:]
print(pixelMatrix)
pixelMatrix = np.array(pixelMatrix)
print(pixelMatrix.shape)
Now the problem shows up in the last three lines below. I want to convert the matrix of values back into an image, with the matrix 'pixelMatrix' as its pixel values.
I've tried many things, but this seems to be the most obvious way:
im2 = Image.new('L',(kolommen,rijen))
im2.putdata(pixels)
im2.show()
When I use this, it just gives me a black image of the correct dimensions.
Any ideas on how to get back the original picture, starting from the values in my matrix pixelMatrix?
P.S. We still have to implement the transformation itself, but that would be useless unless we are sure we can convert a matrix back into a greyscale image.
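One likely cause, offered as a sketch rather than a definitive diagnosis: the while loop above slices `pixels` down to an empty list, so `putdata(pixels)` has nothing to write and the new image stays black. Assuming the matrix holds values in the range 0-255, `Image.fromarray` can build the image straight from the NumPy array. The helper name `matrix_to_image` and the small synthetic matrix are illustrative and not part of the original code.

```python
import numpy as np
from PIL import Image

def matrix_to_image(matrix):
    """Build a greyscale ('L') PIL image from a 2-D array of 0-255 values."""
    arr = np.asarray(matrix)
    # fromarray needs an 8-bit unsigned array to produce mode 'L'.
    return Image.fromarray(arr.astype(np.uint8))

# Synthetic 2x3 matrix standing in for pixelMatrix (rows x columns).
pixelMatrix = np.array([[0, 128, 255],
                        [64, 32, 16]])
im2 = matrix_to_image(pixelMatrix)

# PIL reports size as (columns, rows), the reverse of the NumPy shape.
print(im2.size, im2.mode)

# Round trip: converting the image back gives the same values.
restored = np.array(im2)
print(np.array_equal(restored, pixelMatrix))
```

Calling `im2.show()` afterwards should display the picture. If you prefer to keep `Image.new` and `putdata`, passing `pixelMatrix.flatten().tolist()` instead of the emptied `pixels` list is another option.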
I am using the pyautogui.locateOnScreen() function to locate elements in Chrome, get their x,y coordinates, and click them. At some point, though, I need to take a screenshot of part of the screen, search for an object within that screenshot, and get its coordinates. Is it possible to do this with pyautogui?
My example code:
import pyautogui
coord_one = pyautogui.locateOnScreen("first_image.png", confidence=0.95)
scshoot = pyautogui.screenshot(region=coord_one)
coord_two = ...  # search for the second image in scshoot and, if it is detected, get its coordinates
If it is not possible with pyautogui, can you advise the easiest, smartest way?
Thanks in advance.
I don't believe there is a built-in, direct way to do what you need, but the python-opencv library does the job.
The following code sample assumes you have just taken a screen capture, "capture.png", and want to find "logo.png" in it, knowing that the logo is a subsection of the capture. (In the code, these two files are loaded as res/original.png and res/crop.png.)
Minimal example
"""Get bounding box of cropped image from original image."""
import cv2 as cv
import numpy as np
img_rgb = cv.imread(r'res/original.png')
# the cropped image, expected to be smaller
target_img = cv.imread(r'res/crop.png')
# shape is (height, width, channels), so the reversed tuple is (channels, w, h)
_, w, h = target_img.shape[::-1]
res = cv.matchTemplate(img_rgb, target_img, cv.TM_CCOEFF_NORMED)
# with the method used, the data in res are top-left pixel coords
min_val, max_val, min_loc, max_loc = cv.minMaxLoc(res)
top_left = max_loc
# if we add to it the width and height of the target, then we get the bbox.
bottom_right = (top_left[0] + w, top_left[1] + h)
cv.rectangle(img_rgb, top_left, bottom_right, 255, 2)
cv.imshow('', img_rgb)
cv.waitKey(0)  # keep the window open until a key is pressed
MatchTemplate
From the docs, MatchTemplate "simply slides the template image over the input image (as in 2D convolution) and compares the template and patch of input image under the template image." Under the hood, it offers comparison methods such as squared difference to compare the images, which are represented as arrays.
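To make the sliding comparison concrete, here is a minimal NumPy sketch of the squared-difference idea on a tiny synthetic greyscale array. It is not OpenCV's implementation, and the function name `match_template_sqdiff` is made up for illustration; a lower score means a better match.

```python
import numpy as np

def match_template_sqdiff(image, template):
    """Return ((x, y), score) for the top-left corner where `template`
    best matches `image`, using the sum of squared differences."""
    ih, iw = image.shape
    th, tw = template.shape
    tmpl = template.astype(np.int64)
    best_score, best_xy = None, None
    # Slide the template over every position where it fits entirely.
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            patch = image[y:y + th, x:x + tw].astype(np.int64)
            score = np.sum((patch - tmpl) ** 2)
            if best_score is None or score < best_score:
                best_score, best_xy = score, (x, y)
    return best_xy, best_score

# Synthetic "screenshot" with a distinctive 2x2 patch at x=3, y=1.
image = np.zeros((5, 6), dtype=np.uint8)
image[1:3, 3:5] = [[10, 20], [30, 40]]
template = np.array([[10, 20], [30, 40]], dtype=np.uint8)

top_left, score = match_template_sqdiff(image, template)
# Adding the template width and height gives the bounding box.
bottom_right = (top_left[0] + template.shape[1],
                top_left[1] + template.shape[0])
print(top_left, bottom_right, score)
```

If the larger image was captured with `pyautogui.screenshot(region=...)`, adding the region's left and top values to the matched position should give coordinates relative to the whole screen.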
See more
For a more in-depth explanation, check the OpenCV docs, as the code is entirely based on their example.
I've just started using the wordcloud module in Python 3.7. I'm using the code below to generate word clouds from a dictionary, and I'm trying different masks, but it only works for some images: so far it has worked in two cases, with images of 831x816 and 1000x808. Does this have to do with the size of the image? Or is it because the image is somewhat blurry? Or is it something else?
Here is my code:
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image
from wordcloud import WordCloud
our_mask = np.array(Image.open('twitter.png'))
twitter_cloud = WordCloud(background_color = 'white', mask = our_mask)
twitter_cloud.generate_from_frequencies(frequencies)
twitter_cloud.to_file("twitter_cloud.jpg")
plt.imshow(twitter_cloud)
plt.axis('off')
plt.show()
How can I fix this?
I had a similar problem with a black-and-white image I used. What fixed it for me was cropping the image more closely to the black drawing, so there was no unnecessary bulk of white area at the edges.
Some images need to be adjusted before they can be used as a mask. Note that only pure white pixel values (255) are treated as masked out; every other value is masked in. The problem is that in some images the background is not stored as 255 in the NumPy array. To solve this, you can do the following:
1. Create the mask object (please try with your own image, as I couldn't upload mine):
import numpy as np
import pandas as pd
from PIL import Image
from wordcloud import WordCloud
mask = np.array(Image.open("filepath/picture.png"))
print(mask)
If the value printed for the white areas is 255, the mask is fine. If it is 0, or some other value, it has to be changed to 255.
2. If the values are something else, change them as follows:
2-1. Create a function for the transformation (here the value to replace is 0):
def transform_zeros(val):
    if val == 0:
        return 255
    else:
        return val
2-2. Create an array of the same shape:
maskable_image = np.ndarray((mask.shape[0], mask.shape[1]), np.int32)
2-3. Apply the transformation row by row:
for i in range(len(mask)):
    maskable_image[i] = list(map(transform_zeros, mask[i]))
3. Check the result:
print(maskable_image)
Then you can use this array as your mask:
mask = maskable_image
All of this is copied and adapted from this link, so check it if you find my explanation unclear; I have only provided the solution and don't understand much about image color arrays and their transformation.
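As a shorter alternative to the row-by-row loop, the same replacement can probably be done in one vectorized step with `np.where`. This is a sketch on a small synthetic 2-D array, assuming (as in the steps above) that 0 is the value that needs to become 255.

```python
import numpy as np

# Synthetic 2-D greyscale mask: 0 marks the background here instead of 255.
mask = np.array([[0, 0, 120],
                 [0, 200, 255],
                 [35, 0, 0]], dtype=np.uint8)

# Replace every 0 with 255 in one step; all other values are kept as they are.
maskable_image = np.where(mask == 0, 255, mask).astype(np.uint8)

print(maskable_image)
```

Unlike the `map`-based version, this form works element by element regardless of the number of dimensions, so it would also run on an RGB array, although whether that gives a sensible mask depends on the image.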
I'm working on a project related to road recognition from a standard Google Map view. Some navigation features will be added to the project later on.
I already extracted all the white pixels (representing road on the map) according to the RGB criteria. Also, I stored all the white pixel (roads) coordinates (2D) in one list named "all_roads". Now I want to extract each road in terms of the pixel coordinates and place them into different lists (one road in one list), but I'm lacking ideas.
I'd like to use Dijkstra's algorithm to calculate the shortest path between two points, but I need to create "nodes" on each road intersection. That's why I'd like to store each road in the corresponding list for further processing.
I hope someone could provide some ideas and methods. Thank you!
Note: The RGB criteria (the "if" statement in the "threshold" method) seem unnecessary for the chosen map screenshot, but they become useful for other map screenshots whose roads are not white. (This is NOT the point of the question, but I hope it avoids unnecessary confusion.)
# Import numpy to enable numpy array
import numpy as np
# Import time to handle time-related task
import time
# Import mean to calculate the averages of the pixels
from statistics import mean
# Import cv2 to display the image
import cv2
def threshold(imageArray):
    """
    Purpose: Display a given image with road in white according to pixel RGBs
    Argument(s): A matrix generated from a given image.
    Return: A matrix of the same size but only displays white and black.
    """
    newAr = imageArray
    for eachRow in newAr:
        for eachPix in eachRow:
            if eachPix[0] == 253 and eachPix[1] == 242:
                eachPix[0] = 255
                eachPix[1] = 255
                eachPix[2] = 255
    return newAr
# Import the image
g1 = cv2.imread("1.png")
# fix the output image with resolution of 800 * 600
g1 = cv2.resize(g1,(800,600))
# Apply threshold method to the imported image
g2 = threshold(g1)
index = np.where(g2 == [(255,255,255)])
# x coordinate of the white pixels (roads)
print(index[1])
# y coordinate of the white pixels (roads)
print(index[0])
# Storing the 2D coordinates of white pixels (roads) in a list
all_roads = []
for i in range(len(index[0]))[0::3]:
    all_roads.append([index[1][i], index[0][i]])
#Display the modified image
cv2.imshow('g2', g2)
cv2.waitKey(0)
cv2.destroyAllWindows()
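A note on the coordinate extraction above: `np.where(g2 == ...)` compares each channel separately, so the `[0::3]` stride only works if every match belongs to a fully white pixel. A possible alternative, sketched here on a tiny synthetic image rather than the real map, is to build a per-pixel mask with `np.all` and read the coordinates with `np.argwhere`.

```python
import numpy as np

# Synthetic 3x4 BGR image standing in for g2: black with three white pixels.
g2 = np.zeros((3, 4, 3), dtype=np.uint8)
g2[0, 1] = (255, 255, 255)
g2[2, 0] = (255, 255, 255)
g2[2, 3] = (255, 255, 255)
# A pixel with only one channel at 255 must not count as road.
g2[1, 2] = (255, 0, 0)

# True only where all three channels equal 255.
road_mask = np.all(g2 == 255, axis=-1)

# argwhere returns (row, col) pairs, i.e. (y, x); reverse each to get [x, y].
all_roads = np.argwhere(road_mask)[:, ::-1].tolist()
print(all_roads)
```

The boolean `road_mask` keeps the 2-D layout of the image, which may be a more convenient starting point than a flat list when grouping neighbouring road pixels, for example with `cv2.connectedComponents` on `road_mask.astype(np.uint8)`.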
Hi, using the sample image phantom.png, I'm working through some operations with the numpy and skimage libraries. After some modifications, the last exercise asks:
Compress the size of center spots by 50% and plot the final image.
These are the steps I did before.
I read the image with:
img = imread(os.path.join(data_dir, 'phantom.png'))
Then I apply the following to make it black and white:
img[np.less_equal(img[:,:,0],50)] = 0
img[np.greater_equal(img[:,:,0],51)] = 255
I took a couple of slices of the image (the black spots) at the given coordinates:
img_slice=img.copy()
img_slice=img_slice[100:300, 100:200]
img_slice2=img.copy()
img_slice2=img_slice2[100:300, 200:300]
Now flip them
img_slice=np.fliplr(img_slice)
img_slice2=np.fliplr(img_slice2)
And put them back into an image copy
img2=img.copy()
img2[100:300, 200:300]=img_slice
img2[100:300, 100:200]=img_slice2
And this is the resulting image before the final ("compress") exercise:
Then I'm asked to "reduce" the black spots by using the numpy.compress method.
The expected result after using the "compress" method is the following image (screenshot), where the black spots are reduced by 50%:
But I have no clue how to use the numpy.compress method on the image or its slices to get that result. I'm not even close: all I get are chunks of the image that look like cropped or stretched portions of it.
I would appreciate any help or explanation of how the numpy.compress method works here, and whether it is even feasible to use it for this.
You seem ok with cropping and extracting, but just stuck on the compress aspect. So, crop out the middle and save that as im and we will compress that in the next step. Fill the area you cropped from with white.
Now, compress the part you cropped out. In order to reduce by 50%, you need to take alternate rows and alternate columns, so:
# Generate a vector alternating between True and False the same height as "im"
a = [(i%2)==0 for i in range(im.shape[0])]
# Likewise for the width
b = [(i%2)==0 for i in range(im.shape[1])]
# Now take alternate rows with numpy.compress()
r = np.compress(a,im,0)
# And now take alternate columns with numpy.compress()
res = np.compress(b,r,1)
Finally put res back in the original image, offset by half its width and height relative to where you cut it from.
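The steps above can be tried out on a small synthetic array before touching the real image. This is only a sketch: the 4x6 array stands in for the cropped `im`, and the names `keep_rows`, `keep_cols` and `canvas` are illustrative.

```python
import numpy as np

# Synthetic 4x6 "crop" standing in for `im`.
im = np.arange(24).reshape(4, 6)

# Boolean selectors: True for even indices, False for odd ones.
keep_rows = [(i % 2) == 0 for i in range(im.shape[0])]
keep_cols = [(i % 2) == 0 for i in range(im.shape[1])]

# axis=0 drops the rows flagged False; axis=1 then drops the columns.
res = np.compress(keep_cols, np.compress(keep_rows, im, axis=0), axis=1)
print(res.shape)

# Paste the half-size result centred in a white canvas of the original size.
canvas = np.full_like(im, 255)
top = (im.shape[0] - res.shape[0]) // 2
left = (im.shape[1] - res.shape[1]) // 2
canvas[top:top + res.shape[0], left:left + res.shape[1]] = res
print(canvas)
```

With these alternating selectors, the result is the same as the slice `im[::2, ::2]`; `numpy.compress` is simply the form the exercise asks for.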
I guess you can slice off the center spots first:
center_spots = img2[100:300,100:300].copy()  # copy, otherwise the next step also whitens center_spots
Then replace the center-spot values in the original image with 255 (white):
img2[100:300,100:300] = 255
Then compress center_spots by 50% along both axes and add the result back to img2.
The compressed image will have shape (100,100), so assign it to img2[150:250,150:250].
Check the code below for the output you want. Comment if you need an explanation of it.
import os.path
from skimage.io import imread
from skimage import data_dir
import matplotlib.pyplot as plt
import numpy as np
img = imread(os.path.join(data_dir, 'phantom.png'))
img[np.less_equal(img[:,:,0],50)] = 0
img[np.greater_equal(img[:,:,0],51)] = 255
img_slice=img[100:300,100:200]
img_slice2=img[100:300,200:300]
img_slice=np.fliplr(img_slice)
img_slice2=np.fliplr(img_slice2)
img2=img.copy()
img2[100:300, 200:300]=img_slice
img2[100:300, 100:200]=img_slice2
#extract the left and right images
img_left = img2[100:300,100:200]
img_right = img2[100:300,200:300]
#reduce the size of the extracted images using compress
#ndarray.compress(condition, axis): condition is a list of True/False (or 1/0) flags;
#axis = 0 applies the flags to rows, axis = 1 applies them to columns
#every row (or column) whose flag is False or 0 is removed from the image
#note: len(A) -> number of rows and len(A[0]) -> number of columns
#reducing the height-> axis = 0
img_left = img_left.compress([not(i%2) for i in range(len(img_left))],axis = 0)
#reducing the width-> axis = 1
img_left = img_left.compress([not(i%2) for i in range(len(img_left[0]))],axis = 1)
#reducing the height-> axis = 0
img_right = img_right.compress([not(i%2) for i in range(len(img_right))],axis = 0)
#reducing the width-> axis = 1
img_right = img_right.compress([not(i%2) for i in range(len(img_right[0]))],axis = 1)
#clearing the area before pasting the left and right minimized images
img2[100:300,100:200] = 255 #255 is for whitening the pixel
img2[100:300,200:300] = 255
#paste the reduced size images back into the main picture(but notice the coordinates!)
img2[150:250,125:175] = img_left
img2[150:250,225:275] = img_right
plt.imshow(img2)
plt.show()
See the numpy.compress documentation.
# `copy` is assumed to be a copy of the flipped image (img2 in the question)
eyes = copy[100:300,100:300]
eyes1 = eyes
e = [(i%2 == 0) for i in range(eyes.shape[0])]
f = [(i%2 == 0) for i in range(eyes.shape[1])]
eyes1 = eyes1.compress(e,axis = 0)
eyes1 = eyes1.compress(f,axis = 1)
# plt.imshow(eyes1)
copy[100:300,100:300] = 255
copy[150:250,150:250] = eyes1
plt.imshow(copy)
I am reading an image with SimpleITK, but I get these results in VTK. Any help?
I am not sure where things are going wrong here.
Please see image here.
####
CODE
import SimpleITK as sitk
import vtk

# pixelmap (a mapping from SimpleITK pixel IDs to VTK scalar types) is defined elsewhere
def sitk2vtk(img):
    size = list(img.GetSize())
    origin = list(img.GetOrigin())
    spacing = list(img.GetSpacing())
    sitktype = img.GetPixelID()
    vtktype = pixelmap[sitktype]
    ncomp = img.GetNumberOfComponentsPerPixel()
    # there doesn't seem to be a way to specify the image orientation in VTK
    # convert the SimpleITK image to a numpy array
    i2 = sitk.GetArrayFromImage(img)
    #import pylab
    #i2 = reshape(i2, size)
    # tobytes() replaces the deprecated tostring() alias
    i2_string = i2.tobytes()
    # send the numpy array to VTK with a vtkImageImport object
    dataImporter = vtk.vtkImageImport()
    dataImporter.CopyImportVoidPointer(i2_string, len(i2_string))
    dataImporter.SetDataScalarType(vtktype)
    dataImporter.SetNumberOfScalarComponents(ncomp)
    # VTK expects 3-dimensional parameters
    if len(size) == 2:
        size.append(1)
    if len(origin) == 2:
        origin.append(0.0)
    if len(spacing) == 2:
        spacing.append(spacing[0])
    # Set the new VTK image's parameters
    #
    dataImporter.SetDataExtent(0, size[0]-1, 0, size[1]-1, 0, size[2]-1)
    dataImporter.SetWholeExtent(0, size[0]-1, 0, size[1]-1, 0, size[2]-1)
    dataImporter.SetDataOrigin(origin)
    dataImporter.SetDataSpacing(spacing)
    dataImporter.Update()
    vtk_image = dataImporter.GetOutput()
    return vtk_image
###
END CODE
You are ignoring two things:
1. There is an order change when you perform GetArrayFromImage:
The order of indices and dimensions needs careful attention during conversion. Quote from the SimpleITK Notebooks at http://insightsoftwareconsortium.github.io/SimpleITK-Notebooks/01_Image_Basics.html:
ITK's Image class does not have a bracket operator. It has a GetPixel which takes an ITK Index object as an argument, which is an array ordered as (x,y,z). This is the convention that SimpleITK's Image class uses for the GetPixel method as well.
While in numpy, an array is indexed in the opposite order (z,y,x).
2. There is a change of coordinates between ITK and VTK image representations. Historically, in computer graphics there is a tendency to align the camera in such a way that the positive Y axis is pointing down. This results in a change of coordinates between ITK and VTK images.
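Both points can be illustrated with plain NumPy, without SimpleITK or VTK installed. This is a sketch under assumptions: the size `(4, 3, 2)` is hypothetical, the array only mimics the (z,y,x) layout described above, and whether a y flip is actually needed depends on how the image is rendered, so that last step should be verified against your own data.

```python
import numpy as np

# Hypothetical image size in ITK order (x, y, z); not taken from the post.
size_xyz = (4, 3, 2)

# Array laid out the way the quote describes numpy indexing: (z, y, x).
arr_zyx = np.arange(np.prod(size_xyz)).reshape(size_xyz[::-1])
print(arr_zyx.shape)

# The voxel at ITK index (x, y, z) is read from numpy as arr[z, y, x].
x, y, z = 1, 2, 0
voxel = arr_zyx[z, y, x]

# In the flat C-order buffer, x varies fastest: offset = x + nx*(y + ny*z).
nx, ny, nz = size_xyz
flat = arr_zyx.ravel()
print(voxel == flat[x + nx * (y + ny * z)])

# Flipping axis 1 mirrors the y direction, one way to account for a
# y-down versus y-up convention before exporting the buffer.
flipped = arr_zyx[:, ::-1, :]
print(flipped[z, ny - 1 - y, x] == voxel)
```

The practical consequence for the question's code is that `size` comes from `GetSize()` in (x,y,z) order while the array shape is (z,y,x), so the two should not be mixed up when reshaping or setting extents.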