Theano Reshaping - reshape

I am unable to clearly comprehend theano's reshape. I have an image matrix of shape:
[batch_size, stack1_size, stack2_size, height, width]
, where there are stack2_size stacks of images, each having stack1_size of channels. I now want to convert them into the following shape:
[batch_size, stack1_size*stack2_size, 1 , height, width]
such that all the stacks will be combined together into one stack of all channels. I am not sure if reshape will do this for me. I see that reshape seems to not lexicographically order the pixels if they are mixed in dimensions in the middle. I have been trying to achieve this with a combination of dimshuffle,reshape and concatenate, but to no avail. I would appreciate some help.
Thanks.

Theano reshape works just like numpy reshape with its default order, i.e. 'C':
ā€˜Cā€™ means to read / write the elements using C-like index order, with
the last axis index changing fastest, back to the first axis index
changing slowest.
Here's an example showing that the image pixels remain in the same order after a reshape via either numpy or Theano.
import numpy
import theano
import theano.tensor
def main():
batch_size = 2
stack1_size = 3
stack2_size = 4
height = 5
width = 6
data = numpy.arange(batch_size * stack1_size * stack2_size * height * width).reshape(
(batch_size, stack1_size, stack2_size, height, width))
reshaped_data = data.reshape([batch_size, stack1_size * stack2_size, 1, height, width])
print data[0, 0, 0]
print reshaped_data[0, 0, 0]
x = theano.tensor.TensorType('int64', (False,) * 5)()
reshaped_x = x.reshape((x.shape[0], x.shape[1] * x.shape[2], 1, x.shape[3], x.shape[4]))
f = theano.function(inputs=[x], outputs=reshaped_x)
print f(data)[0, 0, 0]
main()

Related

Generate 8 bit image with numpy

I'm trying to generate an image of all 8 bit colours. And this is the important bit: 1 pixel represents 1 unique colour. That's 2^8 or 256 colours - should be a 32 x 32 image.
The plan is to be able to change the bit depth and create a different image. ie 65536 colours for 16 bit.
Here's what I have:
import numpy as np
from PIL import Image
# --------------------------------------------------------------
def create_image(output, width, height, pixels):
# Convert the pixels into an array using numpy
array = np.array(pixels, dtype=np.uint8)
img = Image.fromarray(array)
img.save(output)
# --------------------------------------------------------------
bit = 8
cmap = plt.get_cmap("viridis", 2**bit)
a = cmap(np.linspace(0,1,2**bit))
numOfCols = (len(a)) # number of cols
x = int(np.sqrt(2**bit)*2)
y = int(np.sqrt(2**bit)*2)
arr = np.reshape(a, (x, y))
create_image("test.png", x, y, arr)
I'm new to numpy and I may have the initial size of the array wrong, as I get an error
ValueError: cannot reshape array of size 1024 into shape (16,16)
if I try to force it into an array that's 16 x 16.
Secondly, the image is just black, which is great for coffee, not so good for my results.
How do I transfer the array with all the colour data to the image properly?
First of all, your colormap generates an array of values in the following fashion:
In [71]: mymap = cmap(np.linspace(0, 1, 2 ** bit))
In [72]: mymap
Out[72]:
array([[0.267004, 0.004874, 0.329415, 1. ],
[0.26851 , 0.009605, 0.335427, 1. ],
[0.269944, 0.014625, 0.341379, 1. ],
...,
[0.974417, 0.90359 , 0.130215, 1. ],
[0.983868, 0.904867, 0.136897, 1. ],
[0.993248, 0.906157, 0.143936, 1. ]])
In this question, it's noted that PIL cannot handle the 32-bit floating point RGB format.
It does support tuples of 3 8-bit integers, so our goal is to make these things integer and scale them to 0-255 range. And remove the last column (opacity).
# Filter out ones
mymap = mymap[:, :-1]
# Multiply by 256 and convert to uint8
mymap = np.uint8(mymap * 256)
Now we have to properly reshape it into a 16x16 array.
You actually have to reshape into (16, 16, 3), as the result would be a 3d array.
mymap = mymap.reshape(16, 16 ,3)
And, finally, make a PIL image out of that and write out
img = Image.fromarray(mymap)
img.save("output.png")
My result looks like this: ( please zoom in as it's only 16x16 pixels )

Maxpool of an image in pytorch

I'm trying to just apply maxpool2d (from torch.nn) on a single image (not as a maxpool layer). Here is my code right now:
name = 'astronaut'
imshow(images[name], name)
img = images[name]
# pool of square window of size=3, stride=1
m = nn.MaxPool2d(3,stride = 1)
img_transform = torch.Tensor(images[name])
plt.imshow(m(img_transform).view((512,510)))
The issue is, this code gives me a very green image as a result. I am sure the problem is with the dimensions of view, but I was unable to find how to apply maxpool to just one image so I couldn't fix it. The dimension of the image I'm considering is 512x512. The arguments for view make no sense for me right now, it's just the only number that gives a result...
If for example, I gave 512,512 as the argument for view, I get the following error:
RuntimeError: shape '[512, 512]' is invalid for input of size 261120
If anyone can tell me how to apply maxpool, avgpool, or minpool to an image and display the result I would be super grateful!
Thanks (:
Assuming your image is a numpy.array upon loading (please see comments for explanation of each step):
import numpy as np
import torch
# Assuming you have 3 color channels in your image
# Assuming your data is in Width, Height, Channels format
numpy_img = np.random.randint(low=0, high=255, size=(512, 512, 3))
# Transform to tensor
tensor_img = torch.from_numpy(numpy_img)
# PyTorch takes images in format Channels, Width, Height
# We have to switch their dimensions using `permute`
tensor_img = tensor_img.permute(2, 0, 1)
tensor_img.shape # Shape [3, 512, 512]
# Layers always need batch as first dimension (even for one image)
# unsqueeze will add it for you
ready_tensor_img = tensor_img.unsqueeze(dim=0)
ready_tensor_img.shape # Shape [1, 3, 512, 512]
pooling = torch.nn.MaxPool2d(kernel_size=3, stride=1)
# You need to cast your image to float as
# pooling is not implemented for Tensors of type long
new_img = pooling(ready_tensor_img.float())
If your image is black and white you would need shape [1, 1, 512, 512] (single channel only), you can't leave/squeeze those dimensions, they always have to be there for any torch.nn.Module!
To transform tensor into image again you could use similar steps:
# Cast to long and squeeze batch dimension
no_batch = new_img.long().squeeze(dim=0)
# Unpermute
width_height_channels = no_batch.permute(1, 2, 0)
width_height_channels.shape # Shape: [510, 510, 3]
# Cast to numpy and you have your image
final_image = width_height_channels.numpy()

How to deform/scale a 3 dimensional numpy array in one dimension?

I would like to deform/scale a three dimensional numpy array in one dimension. I will visualize my problem in 2D:
I have the original image, which is a 2D numpy array:
Then I want to deform/scale it for some factor in dimension 0, or horizontal dimension:
For PIL images, there are a lot of solutions, for example in pytorch, but what if I have a numpy array of shapes (w, h, d) = (288, 288, 468)? I would like to upsample the width with a factor of 1.04, for example, to (299, 288, 468). Each cell contains a normalized number between 0 and 1.
I am not sure, if I am simply not looking for the correct vocabulary, if I try to search online. So also correcting my question would help. Or tell me the mathematical background of this problem, then I can write the code on my own.
Thank you!
You can repeat the array along the specific axis a number of times equal to ceil(factor) where factor > 1 and then evenly space indices on the stretched dimension to select int(factor * old_length) elements. This does not perform any kind of interpolation but just repeats some of the elements:
import math
import cv2
import numpy as np
from scipy.ndimage import imread
img = imread('/tmp/example.png')
print(img.shape) # (512, 512)
axis = 1
factor = 1.25
stretched = np.repeat(img, math.ceil(factor), axis=axis)
print(stretched.shape) # (512, 1024)
indices = np.linspace(0, stretched.shape[axis] - 1, int(img.shape[axis] * factor))
indices = np.rint(indices).astype(int)
result = np.take(stretched, indices, axis=axis)
print(result.shape) # (512, 640)
cv2.imwrite('/tmp/stretched.png', result)
This is the result (left is original example.png and right is stretched.png):
Looks like it is as easy as using the torch.nn.functional.interpolate functional from pytorch and choosing 'trilinear' as interpolation mode:
import torch
PET = torch.tensor(data)
print("Old shape = {}".format(PET.shape))
scale_factor_x = 1.4
# Scaling.
PET = torch.nn.functional.interpolate(PET.unsqueeze(0).unsqueeze(0),\
scale_factor=(scale_factor_x, 1, 1), mode='trilinear').squeeze().squeeze()
print("New shape = {}".format(PET.shape))
output:
>>> Old shape = torch.Size([288, 288, 468])
>>> New shape = torch.Size([403, 288, 468])
I verified the results by looking at the data, but I can't show them here due to data privacy. Sorry!
This is an example for linear up-sampling a 3D Image with scipy.interpolate, hope it helps.
(I worked quite a lot with np.meshgrid here, if you not familiar with it i recently explained it here)
import numpy as np
import matplotlib.pyplot as plt
import scipy
from scipy.interpolate import RegularGridInterpolator
# should be 1.3.0
print(scipy.__version__)
# =============================================================================
# producing a test image "image3D"
# =============================================================================
def some_function(x,y,z):
# output is a 3D Gaussian with some periodic modification
# its only for testing so this part is not impotent
out = np.sin(2*np.pi*x)*np.cos(np.pi*y)*np.cos(4*np.pi*z)*np.exp(-(x**2+y**2+z**2))
return out
# define a grid to evaluate the function on.
# the dimension of the 3D-Image will be (20,20,20)
N = 20
x = np.linspace(-1,1,N)
y = np.linspace(-1,1,N)
z = np.linspace(-1,1,N)
xx, yy, zz = np.meshgrid(x,y,z,indexing ='ij')
image3D = some_function(xx,yy,zz)
# =============================================================================
# plot the testimage "image3D"
# you will see 5 images that corresponds to the slicing of the
# z-axis similar to your example picture_
# https://sites.google.com/site/linhvtlam2/fl7_ctslices.jpg
# =============================================================================
def plot_slices(image_3d):
f, loax = plt.subplots(1,5,figsize=(15,5))
loax = loax.flatten()
for ii,i in enumerate([8,9,10,11,12]):
loax[ii].imshow(image_3d[:,:,i],vmin=image_3d.min(),vmax=image_3d.max())
plt.show()
plot_slices(image3D)
# =============================================================================
# interpolate the image
# =============================================================================
interpolation_function = RegularGridInterpolator((x, y, z), image3D, method = 'linear')
# =============================================================================
# evaluate at new grid
# =============================================================================
# create the new grid that you want
x_new = np.linspace(-1,1,30)
y_new = np.linspace(-1,1,40)
z_new = np.linspace(-1,1,N)
xx_new, yy_new, zz_new = np.meshgrid(x_new,y_new,z_new,indexing ='ij')
# change the order of the points to match the input shape of the interpolation
# function. That's a bit messy but i couldn't figure out a way around that
evaluation_points = np.rollaxis(np.array([xx_new,yy_new,zz_new]),0,4)
interpolated = interpolation_function(evaluation_points)
plot_slices(interpolated)
The original (20,20,20) dimensional 3D Image:
And the upsampeled (30,40,20) dimensional 3D Image:

Does tensorflow have shape variables?

Often, I'd like to work with variable-size data, e.g. the number of samples.
To get this data into tensorflow, I use python variables (e.g. "num_samples=2000") to define shapes.
This means I have to re-create a new graph for each number of samples.
Setting validate_shape=False is not an option to me.
Is there a Tensorflow-way of having dimension sizes as variables?
tf.placeholder() allows you to create tensors which will be filled only at runtime ; and it allows to define tensors with variable-size dimensions using None in their shape.
tf.shape() gives you the dynamic size of a tensor, itself as a tensor (actually as a tf.TensorShape, which you can use e.g. to dynamically generate other tensors). See tf.TensorShape for more detailed explanations.
An example to hopefully make things clearer:
import tensorflow as tf
import numpy as np
# Creating a placeholder for 3-channel images with undefined batche size, height and width:
images = tf.placeholder(tf.float32, shape=(None, None, None, 3))
# Dynamically obtaining the actual shape of the images:
images_shape = tf.shape(images)
# Demonstrating how this shape can be use to dynamically create other tensors:
ones = tf.ones(images_shape, dtype=images.dtype)
images_plus1 = images + ones
with tf.Session() as sess:
for i in range(2):
# Generating a random number of images with a random HxW:
num_images = np.random.randint(1, 10)
height, width = np.random.randint(10, 20), np.random.randint(10, 20)
images_zero = np.zeros((num_images, height, width, 3), dtype=np.float32)
# Running our TF operation, feeding the placeholder with the actual images:
res = sess.run(images_plus1, feed_dict={images: images_zero})
print("Shape: {} ; Pixel Val: {}".format(res.shape, res[0, 0, 0]))
# > Shape: (6, 14, 13, 3) ; Pixel Val: [1. 1. 1.]
# > Shape: (8, 11, 15, 3) ; Pixel Val: [1. 1. 1.]
# ^ As you can see, we run the same graph each time with a different number of
# images / different shape

Python Shape function for k-means clustering

I have one geotiff grey scale image which gave me the (4377, 6172) 2D array. In the first part, I am considering (:1024, :1024) values(Total values are -> 1024 * 1024 = 1048576) for my compression algorithm. Through this algorithm, I am getting total 4 values in finalmatrix list var through the algorithm. After this, I am applying K-means algorithm on that values. A program is below :
import numpy as np
from osgeo import gdal
from sklearn import cluster
import matplotlib.pyplot as plt
dataset =gdal.Open("1.tif")
band = dataset.GetRasterBand(1)
img = band.ReadAsArray()
finalmat = [255, 0, 2, 2]
#Converting list to array for dimensional change
ay = np.asarray(finalmat).reshape(-1,1)
fig = plt.figure()
k_means = cluster.KMeans(n_clusters=2)
k_means.fit(ay)
cluster_means = k_means.cluster_centers_.squeeze()
a_clustered = k_means.labels_
print('# of observation :',ay.shape)
print('Cluster Means : ', cluster_means)
a_clustered.shape= img.shape
fig=plt.figure(figsize=(125,125))
ax = plt.subplot(2,4,8)
plt.axis('off')
xlabel = str(1) , ' clusters'
ax.set_title(xlabel)
plt.imshow(a_clustered)
plt.show()
fig.savefig('kmeans-1 clust ndvi08jan2010_guj 12 .png')
In the above Program I am getting error in the line a_clustered.shape= img.shape. The error which I am getting is below:
Error line:
a_clustered.shape= img.shape
ValueError: cannot reshape array of size 4 into shape (4377,6172)
<matplotlib.figure.Figure at 0x7fb7c63975c0>
Actually, I want to visualize the clustering on Original image through compressed value which I am getting. Can you please give suggestion what to do
It does not make a lot of sense to use KMeans on 1 dimensional data.
And it makes even less sense to use it on a 4 x 1 array!
Your site then comes from the fact that you can't just resize a 4 x 1 integer array into a large picture.
Just print the array a_clustered you are trying to plot. It probably contains [0, 1, 1, 1].

Resources