Pytorch tensor changing colors - pytorch

I would appreciate any help with this.
Why do the image colors change after I put a 3D tensor (an image) into a 4D tensor?
from PIL import Image
from torchvision import transforms
import torch
p = "path/to/image"
p = Image.open(p)
p = transforms.PILToTensor()(p)
transforms.ToPILImage()(p).show() # ok (left pic)
temp = torch.zeros(4, p.size()[0], p.size()[1], p.size()[2])
temp[0] = p
transforms.ToPILImage()(temp[0]).show() # not ok (right pic)

The reason is that the first tensor, p, is an integer (uint8) tensor with values in the range 0 - 255. torch.zeros creates a float32 tensor by default, so after the assignment temp[0] is a float tensor with values in the range 0.0 - 255.0. Display functions treat the two dtypes differently: imshow expects integer values between 0 and 255 or float values between 0 and 1 (see the matplotlib imshow documentation), and ToPILImage likewise assumes that float tensors lie between 0 and 1.
To fix this problem, you have two options: either pass dtype=torch.uint8 when you create the temp tensor, or divide the tensor values by 255 to scale them to 0 - 1.
# cell 1
from PIL import Image
from torchvision import transforms
import torch
from matplotlib import pyplot as plt
p = Image.open("pi.png")
p = transforms.PILToTensor()(p).permute(1, 2, 0)
plt.imshow( p ) #ok
# cell 2
temp = torch.zeros(4, p.size()[0], p.size()[1], p.size()[2], dtype=torch.uint8)
temp[0] = p
plt.imshow(temp[0]) # or you can use plt.imshow(temp[0]/255)
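The dtype change can be checked without an image file. The following is a minimal sketch, assuming a random uint8 tensor as a stand-in for the loaded picture; the names temp_float, temp_uint8 and temp_scaled are illustrative only.

```python
import torch

# Stand-in for an image loaded with PILToTensor: uint8, shape (C, H, W).
p = torch.randint(0, 256, (3, 4, 5), dtype=torch.uint8)

# torch.zeros defaults to float32, so the copied values become 0.0-255.0 floats.
temp_float = torch.zeros(4, *p.shape)
temp_float[0] = p
print(temp_float.dtype)  # torch.float32

# Option 1: match the source dtype so the batch stays uint8.
temp_uint8 = torch.zeros(4, *p.shape, dtype=p.dtype)
temp_uint8[0] = p

# Option 2: keep floats but rescale to the 0-1 range used for float images.
temp_scaled = temp_float / 255
```

Passing dtype=p.dtype instead of hard-coding torch.uint8 keeps the batch consistent with whatever the loader returns.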

Related

How can I convert XYZ point cloud to binary mask image

I want to convert a point cloud (a set of X, Y, Z points) to a binary mask image using Python. The problem is that these points are floats and fall outside the range 0-255. To be more specific, the points belong to an object (a rectangle or an ellipsoid), so I need to build a binary image based on the Z dimension, marking the rectangle, for example, as 0 and all other points as 1 in the binary mask.
Can anyone give me some ideas on how to achieve this?
My points look like this array:
[[-1.56675167e+01 1.59539632e+01 1.15432026e-02]
[-1.26066835e+01 6.48645007e+00 1.15510724e-02]
[-1.18854252e+01 1.71767061e+01 1.15392632e-02]
...
[-1.45721083e+01 1.39116935e+01 -9.86438582e-04]
[-1.42607847e+01 1.28141373e+01 -1.73514791e-03]
[-1.48834319e+01 1.50092497e+01 7.59929187e-04]]
I tried to get such a binary mask using the approach from the answer in this example:
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.path import Path
from descartes import PolygonPatch
import alphashape
from shapely.geometry import Point, Polygon
def poly2mask():
    # First of all, I separated the contour of the polygon and get vertices
    # in the border
    hull = alphashape.alphashape(surface_org, 0.) # convex hull
    poly = PolygonPatch(hull, alpha=0.2, edgecolor='#999999')
    vertices = poly.get_path().vertices
    x = vertices[:, 0] * 10
    y = vertices[:, 1] * 10
    vertices_ls = list(zip(x, y))
    width, height = 120, 120
    poly_path = Path(vertices_ls, closed=True)
    x, y = np.mgrid[:height, :width]
    coors = np.hstack((x.reshape(-1, 1), y.reshape(-1, 1)))
    mask = poly_path.contains_points(coors)
    mask = mask.reshape(height, width)
    #print(mask)
    plt.imshow(mask)
    plt.ylim(-200, 200)
    plt.xlim(-200, 200)
    plt.show()
The image would look like this:

Generate 8 bit image with numpy

I'm trying to generate an image of all 8 bit colours. And this is the important bit: 1 pixel represents 1 unique colour. That's 2^8 or 256 colours - should be a 32 x 32 image.
The plan is to be able to change the bit depth and create a different image. ie 65536 colours for 16 bit.
Here's what I have:
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image
# --------------------------------------------------------------
def create_image(output, width, height, pixels):
    # Convert the pixels into an array using numpy
    array = np.array(pixels, dtype=np.uint8)
    img = Image.fromarray(array)
    img.save(output)
# --------------------------------------------------------------
bit = 8
cmap = plt.get_cmap("viridis", 2**bit)
a = cmap(np.linspace(0,1,2**bit))
numOfCols = (len(a)) # number of cols
x = int(np.sqrt(2**bit)*2)
y = int(np.sqrt(2**bit)*2)
arr = np.reshape(a, (x, y))
create_image("test.png", x, y, arr)
I'm new to numpy and I may have the initial size of the array wrong, as I get an error
ValueError: cannot reshape array of size 1024 into shape (16,16)
if I try to force it into an array that's 16 x 16.
Secondly, the image is just black, which is great for coffee, not so good for my results.
How do I transfer the array with all the colour data to the image properly?
First of all, your colormap generates an array of values in the following fashion:
In [71]: mymap = cmap(np.linspace(0, 1, 2 ** bit))
In [72]: mymap
Out[72]:
array([[0.267004, 0.004874, 0.329415, 1. ],
[0.26851 , 0.009605, 0.335427, 1. ],
[0.269944, 0.014625, 0.341379, 1. ],
...,
[0.974417, 0.90359 , 0.130215, 1. ],
[0.983868, 0.904867, 0.136897, 1. ],
[0.993248, 0.906157, 0.143936, 1. ]])
In this question, it's noted that PIL cannot handle the 32-bit floating point RGB format.
It does support tuples of three 8-bit integers, so our goal is to drop the last column (opacity), scale the remaining values to the 0-255 range and convert them to integers.
# Drop the alpha column (all ones)
mymap = mymap[:, :-1]
# Multiply by 255 (not 256, which would overflow uint8 for a value of 1.0) and convert to uint8
mymap = np.uint8(mymap * 255)
Now we have to properly reshape it into a 16x16 array.
You actually have to reshape into (16, 16, 3), as the result would be a 3d array.
mymap = mymap.reshape(16, 16 ,3)
And, finally, make a PIL image out of that and write out
img = Image.fromarray(mymap)
img.save("output.png")
My result looks like this: ( please zoom in as it's only 16x16 pixels )
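The size arithmetic behind the reshape error can be sketched as follows. This is a minimal sketch, assuming a synthetic RGBA ramp as a stand-in for the colormap output so that it runs with numpy alone; the names n_colours, side, ramp and rgba are illustrative.

```python
import numpy as np

bit = 8
n_colours = 2 ** bit             # 256 colours
side = int(np.sqrt(n_colours))   # 16, so the image is 16 x 16

# Stand-in for cmap(np.linspace(0, 1, n_colours)): float RGBA rows in [0, 1].
ramp = np.linspace(0.0, 1.0, n_colours)
rgba = np.stack([ramp, ramp[::-1], ramp ** 2, np.ones(n_colours)], axis=1)
print(rgba.shape, rgba.size)     # (256, 4) 1024

# Drop alpha, scale to 0-255, convert to 8-bit integers.
rgb = rgba[:, :3]
pixels = np.round(rgb * 255).astype(np.uint8)

# 256 rows of 3 channels fit a (16, 16, 3) array, not a (16, 16) one.
image = pixels.reshape(side, side, 3)
print(image.shape, image.dtype)  # (16, 16, 3) uint8
```

The 1024 in the error message is 256 colours times 4 RGBA channels, which is why a plain (16, 16) target cannot hold it. An array shaped like image is what Image.fromarray expects for an RGB picture.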

Changing the pixel value of the top and bottom 10% of the image to 0 [OpenCV]

I tried this with NumPy using the code below, but it didn't work. I'm looking for the fastest way to do this:
img[img.shape[0]:int(img.shape[0]*0.1)] = 0
img[int(img.shape[0])*0.9):] = 0
img is a np.ndarray
Like this:
import numpy as np
# Create solid grey image
grey = np.full([100,250], 128, dtype=np.uint8)
# Determine how many rows
N = grey.shape[0]//10
# Make first and last N rows black
grey[:N, :] = 0
grey[-N:, :] = 0
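The same slicing can be wrapped in a small helper that takes the fraction as a parameter. This is a sketch only; the function name blank_top_and_bottom and its fraction argument are hypothetical. It also shows why the first line in the question has no effect: a slice that starts at img.shape[0] begins past the last row and is empty.

```python
import numpy as np

def blank_top_and_bottom(img, fraction=0.1):
    """Set the top and bottom `fraction` of the rows of `img` to 0, in place."""
    n = int(img.shape[0] * fraction)
    if n > 0:  # img[-0:] would select every row, so skip when n == 0
        img[:n] = 0
        img[-n:] = 0
    return img

# Works for colour images too, because only the row axis is sliced.
bgr = np.full((100, 250, 3), 128, dtype=np.uint8)
blank_top_and_bottom(bgr)

# The slice from the question starts at the end of the array, so it is empty.
empty = bgr[bgr.shape[0]:int(bgr.shape[0] * 0.1)]
print(empty.shape[0])  # 0
```

Slice assignment writes into the existing array without copying, so it is about as fast as this operation gets in NumPy.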

How to deform/scale a 3 dimensional numpy array in one dimension?

I would like to deform/scale a three dimensional numpy array in one dimension. I will visualize my problem in 2D:
I have the original image, which is a 2D numpy array:
Then I want to deform/scale it for some factor in dimension 0, or horizontal dimension:
For PIL images there are a lot of solutions, for example in PyTorch, but what if I have a numpy array of shape (w, h, d) = (288, 288, 468)? I would like to upsample the width by a factor of 1.04, for example, to (299, 288, 468). Each cell contains a normalized number between 0 and 1.
I am not sure whether I am simply searching online with the wrong vocabulary, so corrections to my question would also help. Alternatively, tell me the mathematical background of this problem and I can write the code on my own.
Thank you!
You can repeat the array along the specific axis a number of times equal to ceil(factor) where factor > 1 and then evenly space indices on the stretched dimension to select int(factor * old_length) elements. This does not perform any kind of interpolation but just repeats some of the elements:
import math
import cv2
import numpy as np
# scipy.ndimage.imread was removed from SciPy; read the grayscale image with OpenCV instead
img = cv2.imread('/tmp/example.png', cv2.IMREAD_GRAYSCALE)
print(img.shape) # (512, 512)
axis = 1
factor = 1.25
stretched = np.repeat(img, math.ceil(factor), axis=axis)
print(stretched.shape) # (512, 1024)
indices = np.linspace(0, stretched.shape[axis] - 1, int(img.shape[axis] * factor))
indices = np.rint(indices).astype(int)
result = np.take(stretched, indices, axis=axis)
print(result.shape) # (512, 640)
cv2.imwrite('/tmp/stretched.png', result)
This is the result (left is original example.png and right is stretched.png):
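The repeat-and-select idea above applies unchanged to a 3D array like the one in the question. Here is a minimal sketch, assuming a small random volume as a stand-in for the real data; the function name stretch_by_repeating is hypothetical.

```python
import math
import numpy as np

def stretch_by_repeating(arr, factor, axis=0):
    """Stretch `arr` along `axis` by `factor` > 1 without interpolation."""
    # Repeat every slice ceil(factor) times, then pick evenly spaced slices.
    repeated = np.repeat(arr, math.ceil(factor), axis=axis)
    new_len = int(arr.shape[axis] * factor)
    idx = np.rint(np.linspace(0, repeated.shape[axis] - 1, new_len)).astype(int)
    return np.take(repeated, idx, axis=axis)

# Stand-in volume: 288 slices along axis 0, kept small in the other axes.
volume = np.random.default_rng(0).random((288, 4, 5))
stretched = stretch_by_repeating(volume, 1.04, axis=0)
print(stretched.shape)  # (299, 4, 5)
```

Every output slice is a copy of an input slice, so the value range is preserved, but some slices appear twice in the result.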
It looks like this is as easy as using torch.nn.functional.interpolate from PyTorch and choosing 'trilinear' as the interpolation mode:
import torch
PET = torch.tensor(data)  # data: the 3D array of float values
print("Old shape = {}".format(PET.shape))
scale_factor_x = 1.4
# Scaling. interpolate expects (batch, channel, D, H, W), hence the two unsqueeze calls.
PET = torch.nn.functional.interpolate(
    PET.unsqueeze(0).unsqueeze(0),
    scale_factor=(scale_factor_x, 1, 1),
    mode='trilinear',
).squeeze()
print("New shape = {}".format(PET.shape))
output:
>>> Old shape = torch.Size([288, 288, 468])
>>> New shape = torch.Size([403, 288, 468])
I verified the results by looking at the data, but I can't show them here due to data privacy. Sorry!
This is an example of linear up-sampling of a 3D image with scipy.interpolate; I hope it helps.
(I work quite a lot with np.meshgrid here. If you are not familiar with it, I recently explained it here.)
import numpy as np
import matplotlib.pyplot as plt
import scipy
from scipy.interpolate import RegularGridInterpolator
# should be 1.3.0
print(scipy.__version__)
# =============================================================================
# producing a test image "image3D"
# =============================================================================
def some_function(x,y,z):
    # output is a 3D Gaussian with some periodic modification
    # it's only for testing, so this part is not important
    out = np.sin(2*np.pi*x)*np.cos(np.pi*y)*np.cos(4*np.pi*z)*np.exp(-(x**2+y**2+z**2))
    return out
# define a grid to evaluate the function on.
# the dimension of the 3D-Image will be (20,20,20)
N = 20
x = np.linspace(-1,1,N)
y = np.linspace(-1,1,N)
z = np.linspace(-1,1,N)
xx, yy, zz = np.meshgrid(x,y,z,indexing ='ij')
image3D = some_function(xx,yy,zz)
# =============================================================================
# plot the testimage "image3D"
# you will see 5 images that corresponds to the slicing of the
# z-axis similar to your example picture_
# https://sites.google.com/site/linhvtlam2/fl7_ctslices.jpg
# =============================================================================
def plot_slices(image_3d):
    f, loax = plt.subplots(1,5,figsize=(15,5))
    loax = loax.flatten()
    for ii,i in enumerate([8,9,10,11,12]):
        loax[ii].imshow(image_3d[:,:,i],vmin=image_3d.min(),vmax=image_3d.max())
    plt.show()
plot_slices(image3D)
# =============================================================================
# interpolate the image
# =============================================================================
interpolation_function = RegularGridInterpolator((x, y, z), image3D, method = 'linear')
# =============================================================================
# evaluate at new grid
# =============================================================================
# create the new grid that you want
x_new = np.linspace(-1,1,30)
y_new = np.linspace(-1,1,40)
z_new = np.linspace(-1,1,N)
xx_new, yy_new, zz_new = np.meshgrid(x_new,y_new,z_new,indexing ='ij')
# change the order of the points to match the input shape of the interpolation
# function. That's a bit messy but i couldn't figure out a way around that
evaluation_points = np.rollaxis(np.array([xx_new,yy_new,zz_new]),0,4)
interpolated = interpolation_function(evaluation_points)
plot_slices(interpolated)
The original 3D image with shape (20,20,20):
And the upsampled 3D image with shape (30,40,20):
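As for the mathematical background asked about in the question, linear resampling along one axis maps each new index to a fractional position in the old index range and blends the two neighbouring slices with weights that sum to 1. The following is a minimal numpy-only sketch of that idea, not the method used by any of the libraries above; the function name resample_axis_linear is hypothetical and the endpoints of the old and new grids are assumed to coincide.

```python
import numpy as np

def resample_axis_linear(arr, new_len, axis=0):
    """Linearly resample `arr` to `new_len` samples along `axis`."""
    old_len = arr.shape[axis]
    # Fractional positions of the new samples in the old index space.
    pos = np.linspace(0, old_len - 1, new_len)
    lo = np.floor(pos).astype(int)          # left neighbour
    hi = np.minimum(lo + 1, old_len - 1)    # right neighbour, clipped at the edge
    w = pos - lo                            # weight of the right neighbour
    # Shape the weights so they broadcast along the chosen axis.
    shape = [1] * arr.ndim
    shape[axis] = new_len
    w = w.reshape(shape)
    return (1 - w) * np.take(arr, lo, axis=axis) + w * np.take(arr, hi, axis=axis)

# Stand-in volume with values in [0, 1), stretched by 1.04 along axis 0.
volume = np.random.default_rng(0).random((288, 4, 5))
new_len = int(volume.shape[0] * 1.04)
stretched = resample_axis_linear(volume, new_len, axis=0)
print(stretched.shape)  # (299, 4, 5)
```

Because every output value is a weighted average of two inputs, data normalized to the range 0 to 1 stays within that range.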

Index 150 out of bounds in axis0 with size 1

I was computing the histogram of an image with a numpy array in Python, using OpenCV. The code is as follows:
# finding histogram of an image
import numpy as np
import cv2
img = cv2.imread("cr7.jpg")
gry_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
a = np.zeros((1,256), dtype=np.uint8)
# finding how many times a particular pixel intensity repeats
for x in range(0, 183):  # size of gry_img is (184,275)
    for y in range(0, 274):
        g = gry_img[x, y]
        a[g] = a[g] + 1
print(a)
Error is as follows:
IndexError: index 150 is out of bounds for axis 0 with size 1
The error message points at the shape of your results array a: it is created with shape (1, 256), so a[g] indexes axis 0, which has size 1, and any g greater than 0 is out of bounds. Since you haven't supplied the image, I can only guess that you may also have mixed up the image dimensions.
The code you have is rather fragile; here is a cleaner way to interact with images. I use an image from OpenCV's data directory: aero1.jpg.
The code here resolves both potential issues identified above:
import cv2
import numpy as np
fname = 'aero1.jpg'
im = cv2.imread(fname)
gry_img = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
gry_img.shape
>>> (480, 640)
# note that the image is 640pix wide by 480 tall;
# the numpy array shows the number of rows first.
# rows are in y / columns are in x
# NOTE the results array `a` need only be 1-dimensional, not 2d (1x256)
# NOTE use a wide integer dtype: uint8 counts would wrap around after 255
a = np.zeros((256, ), dtype=np.int64)
# iterating over all pixels, whatever the shape of the image.
height, width = gry_img.shape
for x in range(width):
    for y in range(height):
        g = gry_img[y, x] # NOTE y, x not x, y
        a[g] += 1
But note that you could also achieve this easily with the numpy function np.histogram, with slightly careful handling of the bin edges.
histb, bin_edges = np.histogram(gry_img.reshape(-1), bins=np.arange(257))
# check that we arrived at the same result as iterating manually:
(a == histb).all()
>>> True
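For 8-bit images, np.bincount is another loop-free option that needs no bin edges. This is a minimal sketch, assuming a random uint8 array as a stand-in for the grayscale image; the variable names are illustrative.

```python
import numpy as np

# Stand-in for a grayscale image of the size mentioned in the question.
rng = np.random.default_rng(0)
gray = rng.integers(0, 256, size=(184, 275), dtype=np.uint8)

# Count how often each intensity occurs; minlength keeps all 256 bins
# even if the brightest values do not appear in the image.
hist = np.bincount(gray.ravel(), minlength=256)

# Cross-check against np.histogram with integer bin edges 0..256.
hist_ref, _ = np.histogram(gray.ravel(), bins=np.arange(257))
print(hist.shape, hist.sum() == gray.size, (hist == hist_ref).all())
# (256,) True True
```

The counts come back in a wide integer dtype, so they cannot wrap around the way a uint8 accumulator does.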

Resources