I have an RGB image tensor with shape (3, H, W), but plt.imshow() cannot display an RGB image in this shape. I want to change the tensor to (H, W, 3). How can I do that? Can PyTorch's .view() function do it?
Not with .view(), which only reshapes the tensor without reordering its elements. Use PyTorch's permute() method instead; see details: https://www.geeksforgeeks.org/python-pytorch-permute-method/
code:
image.permute(1, 2, 0)
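A minimal end-to-end sketch (a random tensor stands in for the image):
import torch
import matplotlib.pyplot as plt

image = torch.rand(3, 64, 64)           # (C, H, W)
plt.imshow(image.permute(1, 2, 0))      # plt.imshow expects (H, W, C)
plt.show()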
An alternative to using torch.Tensor.permute is to apply torch.Tensor.movedim:
image.movedim(0,-1)
This is actually more general than image.permute(1, 2, 0), since it works for any number of dimensions. It has the effect of moving axis=0 to axis=-1, as a kind of insertion operation.
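For instance, a quick sketch on a hypothetical batched tensor: permute has to spell out every axis, while movedim only names the one being moved:
import torch

batch = torch.rand(8, 3, 64, 64)              # (N, C, H, W)
nhwc = batch.movedim(1, -1)                   # (N, H, W, C): only the moved axis is named
same = batch.permute(0, 2, 3, 1)              # equivalent, but every axis must be listed
print(nhwc.shape, torch.equal(nhwc, same))    # torch.Size([8, 64, 64, 3]) True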
Or equivalently with Numpy, using np.moveaxis:
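A minimal sketch of that (with a dummy tensor, since the original snippet is not shown here):
import numpy as np
import torch

image = torch.rand(3, 224, 224)              # (C, H, W)
img_np = np.moveaxis(image.numpy(), 0, -1)   # (3, H, W) -> (H, W, 3)
print(img_np.shape)                          # (224, 224, 3)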
Please refer to this question
img_plot = img.numpy().transpose(1, 2, 0)
plt.imshow(img_plot)
I am using the following code snippet to plot a confusion matrix with the sklearn library.
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay
cm = confusion_matrix(y_test, y_pred, normalize='true')
disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=['anger', 'bordome', 'disgust', 'fear', 'happiness', 'sadness', 'neutral'])
and the result is given below:
[screenshot of the resulting confusion matrix]
Try adding this to your code:
disp.plot(xticks_rotation = 'vertical')
By default, the x-axis tick labels are displayed horizontally, but this behavior can be changed.
You can find more details in the official documentation
https://scikit-learn.org/stable/modules/generated/sklearn.metrics.ConfusionMatrixDisplay.html#sklearn.metrics.ConfusionMatrixDisplay
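For reference, a minimal self-contained sketch (dummy labels stand in for the question's y_test/y_pred):
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay

# Dummy data so the sketch runs on its own; substitute your y_test / y_pred.
y_true = ['anger', 'fear', 'anger', 'neutral', 'sadness', 'fear']
y_pred = ['anger', 'fear', 'neutral', 'neutral', 'sadness', 'anger']
classes = ['anger', 'fear', 'neutral', 'sadness']

cm = confusion_matrix(y_true, y_pred, labels=classes, normalize='true')
disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=classes)
disp.plot(xticks_rotation='vertical')   # rotate the x tick labels so long names don't overlap
plt.show()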
Trying to run the visualization utils tutorial from PyTorch, I used some images of dogs found on the internet (the images used in the tutorial are not distributed for use). Making the grid and showing the result behaves oddly: it shows each channel as a separate image (I guess that is what I am seeing).
So, from the tutorial (the expected result):
But here is what I get with my own images:
I was expecting to see the two images in their original colors in a grid.
Another step I tried following Ivan's comment:
tutorial: https://pytorch.org/vision/master/auto_examples/plot_visualization_utils.html
I would like to know how to fix this (and use make_grid correctly)
Given the output you got, I would assume your images are shaped (height, width, channels) instead of the expected (channels, height, width). You can correct this with torch.permute. The following should produce the desired result:
>>> grid = make_grid(torch.stack([transformed_dog1, transformed_dog2]).permute(0,3,1,2))
>>> show(grid)
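For completeness, a self-contained sketch of the same flow, with random uint8 HWC tensors standing in for the dog images and a show() helper similar to the tutorial's:
import torch
import matplotlib.pyplot as plt
import torchvision.transforms.functional as F
from torchvision.utils import make_grid

def show(grid):
    plt.imshow(F.to_pil_image(grid))
    plt.axis('off')
    plt.show()

# Stand-ins for transformed_dog1 / transformed_dog2, loaded as (H, W, C) uint8.
transformed_dog1 = torch.randint(0, 256, (500, 500, 3), dtype=torch.uint8)
transformed_dog2 = torch.randint(0, 256, (500, 500, 3), dtype=torch.uint8)

batch = torch.stack([transformed_dog1, transformed_dog2])   # (N, H, W, C)
grid = make_grid(batch.permute(0, 3, 1, 2))                 # make_grid expects (N, C, H, W)
show(grid)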
I'm currently trying to perform a Polar to Cartesian Coordinate Image transformation, to display a raw sonar image into a 'fan-display'.
Initially I have a Numpy Array image of type np.float64, that can be seen below:
After doing some searching, I came across this StackOverflow post Inverse transform an image from Polar to Cartesian in OpenCV with a very similar problem, in which the poster seemed to have solved his/her issue by using the Python Wand library (http://docs.wand-py.org/en/0.5.9/index.html), specifically using their set of Distortion functions.
However, when I tried to use Wand and read the image in, I instead ended up with Wand producing the image below, which seems to be smaller than the original one. The weird thing is that img.size still reports the same numbers as the original image's shape.
The code for this transformation can be seen below:
print(raw_img.shape)  # => (369, 256)
wand_img = Image.from_array(raw_img.astype(np.uint8), channel_map="I")
display(wand_img)
print("Current image size", wand_img.size)  # => Current image size (369, 256)
This is quite problematic, as Wand will then produce the wrong 'fan image'. Is anybody familiar with this kind of problem with the Wand library, and if so, what is the recommended solution to fix it?
If this issue isn't resolved soon, I have a backup plan of using OpenCV's cv::remap function (https://docs.opencv.org/4.1.2/da/d54/group__imgproc__transform.html#ga5bb5a1fea74ea38e1a5445ca803ff121). The problem with this is that I'm not sure which mapping arrays (i.e. map_x and map_y) to use to perform the polar-to-Cartesian transformation, since a mapping matrix implementing the transformation equations below:
r = polar_distances(raw_img)
x = r * cos(theta)
y = r * sin(theta)
didn't seem to work and instead threw errors from OpenCV as well.
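For what it's worth, here is a rough sketch of how the map_x/map_y arrays for cv2.remap could be built, assuming the polar image is laid out as (range bins, beams) and the fan spans a known field of view; the geometry and parameter names here are assumptions, not taken from the actual sonar format:
import numpy as np
import cv2

def polar_to_fan(raw_img, fov_deg=90.0):
    # raw_img: (n_ranges, n_beams) array; row = range bin, column = beam (assumed layout).
    n_ranges, n_beams = raw_img.shape
    out_h, out_w = n_ranges, 2 * n_ranges

    # Cartesian coordinates of every output pixel, origin at the bottom centre of the fan.
    xs = np.arange(out_w, dtype=np.float32) - out_w / 2.0
    ys = np.float32(out_h - 1) - np.arange(out_h, dtype=np.float32)
    xv, yv = np.meshgrid(xs, ys)

    r = np.sqrt(xv ** 2 + yv ** 2)            # range in pixels
    theta = np.degrees(np.arctan2(xv, yv))    # bearing from boresight, in degrees

    # map_x picks the source column (beam index), map_y the source row (range bin).
    map_x = ((theta + fov_deg / 2.0) / fov_deg * (n_beams - 1)).astype(np.float32)
    map_y = r.astype(np.float32)

    # Pixels outside the fan map out of range and are filled with the border value.
    return cv2.remap(raw_img.astype(np.float32), map_x, map_y,
                     interpolation=cv2.INTER_LINEAR,
                     borderMode=cv2.BORDER_CONSTANT, borderValue=0)

fan = polar_to_fan(np.random.rand(369, 256), fov_deg=120.0)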
Any kind of help and insight into this issue is greatly appreciated. Thank you!
- NickS
EDIT: I've tried another example image as well, and it still shows a similar problem. First, I imported the image into Python using OpenCV, with these lines of code:
import matplotlib.pyplot as plt
from wand.image import Image
from wand.display import display
import cv2
img = cv2.imread("Test_Img.jpg")
img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
plt.figure()
plt.imshow(img_rgb)
plt.show()
which showed the following display as a result:
However, as I continued and tried to open the img_rgb object with Wand, using the code below:
wand_img = Image.from_array(img_rgb)
display(wand_img)
I'm getting the following result instead.
I tried opening the file directly with wand.image.Image(), which displays the image correctly with the display() function, so I believe there isn't anything wrong with the Wand installation on the system.
Is there a step I'm missing to convert the NumPy array into a Wand Image? If so, what would it be, and what is the suggested method to do so?
Please keep in mind that the NumPy-to-Wand Image conversion is crucial: the raw sonar images are stored as binary data, so NumPy is needed to turn them into proper images.
Is there a step I'm missing to convert the NumPy array into a Wand Image?
No, but there is a bug in Wand's NumPy handling in Wand 0.5.x. The shape of OpenCV's ndarray is (ROWS, COLUMNS, CHANNELS), but Wand expects (WIDTH, HEIGHT, CHANNELS). I believe this has been fixed for the upcoming 0.6.x releases.
If so, what would it be and what is the suggested method to do so?
Swap the values in img_rgb.shape before passing to Wand.
img_rgb.shape = (img_rgb.shape[1], img_rgb.shape[0], img_rgb.shape[2])
with Image.from_array(img_rgb) as img:
    display(img)
I am having a hard time figuring out a proper affine transformation for three different views, i.e. coronal, axial and sagittal, each having separate issues as below:
1: The axial color map gets overlapped with the original sagittal view.
2: Similarly, the sagittal color map gets overlapped with the original axial image.
3: All of them have some kind of orientation issue; this is best visible where the color map and original image for the coronal view match up but with the wrong orientation.
I am saving the original file that I send to the server for prediction; the server generates a color map and returns that file for visualization, and later I display everything together.
On the server, after prediction, here is the code to save the file:
nifti_img = nib.MGHImage(idx, affine, header=header)
Here, affine and header are the original affine and header extracted from the file I sent.
I need to process the "idx" value, which holds the raw data as a NumPy array, but I am not sure what exactly needs to be done. I need help here.
I have been trying hard to solve the issue using the nibabel Python library, but due to my very limited knowledge of how these files and affine transformations work, I am having a hard time figuring out what I should do to correct them.
I am using AMI js with threejs support on the frontend and nibabel with Python on the backend. A solution on either the frontend or the backend is acceptable.
Please help. Thanks in advance.
img = nib.load(img_path)
# Check the orientation you want to reorient to.
# For example, if the original orientation of img is RPI
# and you want to reorient it to RAS, the second and third axes should be flipped.
# ornt[N, 1] is the flip for axis N, where 1 means no flip and -1 means flip.
ornt = np.array([[0, 1],
[1, -1],
[2, -1]])
img_orient = img.as_reoriented(ornt)
nib.save(img_orient, img_path)
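If the source orientation is not known in advance, the flips can also be computed instead of hard-coded; here is a sketch using nibabel's orientation helpers (img_path is a hypothetical path, as above):
import nibabel as nib

img_path = "scan.mgz"                      # hypothetical path
img = nib.load(img_path)
print(nib.aff2axcodes(img.affine))         # current orientation, e.g. ('R', 'P', 'I')

# Derive the transform from the current orientation to RAS rather than writing the flips by hand.
current_ornt = nib.orientations.io_orientation(img.affine)
target_ornt = nib.orientations.axcodes2ornt(('R', 'A', 'S'))
ornt = nib.orientations.ornt_transform(current_ornt, target_ornt)

nib.save(img.as_reoriented(ornt), img_path)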
It was simple, using numpy.moveaxis() and numpy.flip() on the raw data from nibabel, as below.
# Getting the raw data back to process for better orientation and label mapping.
orig_img_data = nib.MGHImage(numpy_arr, affine)
nifti_img = nib.MGHImage(segmented_arr_output, affine)
# Getting original and predicted data to preprocess to original shape and view for visualisation.
orig_img = orig_img_data.get_fdata()
seg_img = nifti_img.get_fdata()
# Placing proper views in proper place and flipping it for a better visualisation as required.
# moveaxis to get original order.
orig_img_ = np.moveaxis(orig_img, -1, 0)
seg_img = np.moveaxis(seg_img, -1, 0)
# Flip axis to overcome mirror image/ flipped view.
orig_img_ = np.flip(orig_img_, 2)
seg_img = np.flip(seg_img, 2)
orig_img_data_ = nib.MGHImage(orig_img_.astype(np.uint8), np.eye(4), header)
nifti_img_ = nib.MGHImage(seg_img.astype(np.uint8), np.eye(4), header)
Note: it's very important to use the same affine matrix to wrap both of these arrays back. A 4x4 identity matrix works better than the original affine matrix, which was creating problems for me.
I am encountering some problems with the Google Earth Engine Python API when generating an RGB image from an ImageCollection.
Basically, to turn the ImageCollection into an Image, I apply a median reduction. After this reduction, I apply the visualize function, where I need to define variables such as min and max. The problem is that these two values are image dependent.
dataset = ee.ImageCollection('LANDSAT/LC08/C01/T1_SR')\
            .filterBounds(ee.Geometry.Polygon([[39.05789266, 13.59051553],
                                               [39.11335033, 13.59051553],
                                               [39.11335033, 13.64477783],
                                               [39.05789266, 13.64477783],
                                               [39.05789266, 13.59051553]]))\
            .filterDate('2016-01-01', '2016-12-31')\
            .select(['B4', 'B3', 'B2'])

reduction = dataset.reduce('median')\
                   .visualize(bands=['B4_median', 'B3_median', 'B2_median'],
                              min=0,
                              max=3000,
                              gamma=1)
Thus, for each image I need to compute these two values, which can slightly change. Since the number of images I need to generate is huge, it is impossible to do that manually. I do not know how to overcome this problem and cannot find any answer to it. One idea would be to find the minimum and maximum values of the image, but I did not find any function in the JavaScript or Python API that allows doing that.
I hope that someone will be able to help me.
You can use img.reduceRegion() to get image statistics for the region you want, for each image you export. You then pass the results of the region reduction into the visualize function. Here is an example:
geom = ee.Geometry.Polygon([[39.05789266, 13.59051553],
[39.11335033, 13.59051553],
[39.11335033, 13.64477783],
[39.05789266, 13.64477783],
[39.05789266, 13.59051553]])
dataset = ee.ImageCollection('LANDSAT/LC08/C01/T1_SR')\
.filterBounds(geom)\
.filterDate('2016-01-01', '2016-12-31')\
.select(['B4', 'B3', 'B2'])
reduction = dataset.median()
stats = reduction.reduceRegion(reducer=ee.Reducer.minMax(),geometry=geom,scale=100,bestEffort=True)
statDict = stats.getInfo()
prettyImg = reduction.visualize(bands=['B4', 'B3', 'B2'],
                                min=[statDict['B4_min'], statDict['B3_min'], statDict['B2_min']],
                                max=[statDict['B4_max'], statDict['B3_max'], statDict['B2_max']],
                                gamma=1)
Using this approach, I get an output image like this:
I hope this helps!