How to get the world coordinates using VTK voxel index? - vtk

I'm a beginner with ITK and VTK. I want to use a 3D region growing algorithm to segment specific tissues. I have finished the volume rendering and can get the voxel index using vtkImagePlaneWidget. But how do I get the world coordinates of that index?

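vtkImageData exposes a method called TransformIndexToPhysicalPoint, which converts voxel indices to world coordinates. Its sibling TransformContinuousIndexToPhysicalPoint does the same for continuous (non-integer) indices and is the one used below.
Called from Python, it looks something like this: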
import vtk
img = vtk.vtkImageData()
img.SetDimensions([256,256,256])
img.SetSpacing([0.5,0.5,0.5])
img.SetOrigin([100,100,100])
physical_coord = [0,0,0] # Placeholder
index_coord = [128,128, 128]
img.TransformContinuousIndexToPhysicalPoint(index_coord, physical_coord)
assert tuple(physical_coord) == (164,164,164)
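For intuition, when the image has no rotation (an identity direction matrix, which is an assumption here), the conversion reduces to origin + spacing * index along each axis. A minimal pure-Python sketch of that arithmetic follows; index_to_physical is a hypothetical helper for illustration, not part of VTK:

```python
def index_to_physical(index, spacing, origin):
    """Map a voxel index to world coordinates.

    Assumes an identity direction matrix (axis-aligned image),
    so each axis is simply origin + spacing * index.
    """
    return tuple(o + s * i for i, s, o in zip(index, spacing, origin))


# Same numbers as the VTK example above
print(index_to_physical((128, 128, 128), (0.5, 0.5, 0.5), (100, 100, 100)))
```

If the image carries a non-identity direction matrix, this shortcut no longer applies and the VTK method should be used instead.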

Related

How to read specific keypoints in COCOEval

I need to calculate the mean average precision (mAP) of specific keypoints (and not of all keypoints, as is done by default).
Here's my code:
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval
# https://github.com/cocodataset/cocoapi/blob/master/PythonAPI/pycocoEvalDemo.ipynb
cocoGt = COCO('annotations/person_keypoints_val2017.json') # initialize COCO ground truth api
cocoDt = cocoGt.loadRes('detections/results.json') # initialize COCO pred api
cat_ids = cocoGt.getCatIds(catNms=['person'])
imgIds = cocoGt.getImgIds(catIds=cat_ids)
cocoEval = COCOeval(cocoGt, cocoDt, 'keypoints')
cocoEval.params.imgIds = imgIds
cocoEval.evaluate()
cocoEval.accumulate()
cocoEval.summarize()
print(cocoEval.stats[0])
This code prints the mAP for all keypoints ['nose', ...,'right_ankle'], but I need it only for a few specific keypoints such as ['nose', 'left_hip', 'right_hip'].
I recently solved this and evaluated only 13 keypoints, leaving out the eyes and the ears as required by my application.
Open cocoeval.py under pycocotools and go to the computeOks function, where you will find two sets of keypoints, the ground truth keypoints and the detection keypoints, each converted to a NumPy array.
Make sure to slice those arrays properly. Each one is built from a Python list of 51 values (17 keypoints x 3 values each).
For example, if you wish to check the mAP only for the nose, the slicing would be as follows:
g = np.array(gt['keypoints'][0:3])
Do the same for the dt array.
Also, set the sigma values of the unwanted keypoints to 0.
You are all set!
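For keypoints that are not contiguous in the list (such as nose, left_hip and right_hip), a single slice is not enough. The sketch below picks the (x, y, v) triplets by name from the flat 51-value list. The helper names are hypothetical, and the keypoint order is assumed to be the standard COCO person order; the 'keypoints' field of the category in your annotation file is the authoritative source.

```python
# Assumed standard COCO person keypoint order (check your annotation file)
COCO_KEYPOINT_NAMES = [
    'nose', 'left_eye', 'right_eye', 'left_ear', 'right_ear',
    'left_shoulder', 'right_shoulder', 'left_elbow', 'right_elbow',
    'left_wrist', 'right_wrist', 'left_hip', 'right_hip',
    'left_knee', 'right_knee', 'left_ankle', 'right_ankle',
]


def keypoint_indices(names):
    """Return the position of each named keypoint in the COCO order."""
    return [COCO_KEYPOINT_NAMES.index(name) for name in names]


def select_keypoints(flat_keypoints, names):
    """Keep only the (x, y, v) triplets of the named keypoints."""
    selected = []
    for k in keypoint_indices(names):
        selected.extend(flat_keypoints[3 * k:3 * k + 3])
    return selected


def select_per_keypoint(values, names):
    """Apply the same selection to any per-keypoint list (one value each)."""
    return [values[k] for k in keypoint_indices(names)]


wanted = ['nose', 'left_hip', 'right_hip']
dummy = list(range(51))  # stand-in for gt['keypoints']
print(select_keypoints(dummy, wanted))
```

The same selection would have to be applied to both the ground truth and the detection lists so that the two arrays stay aligned.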

OSMNx : get coordinates of nodes/corners/edges of polygons/buildings

I am trying to retrieve the coordinates of all nodes/corners/edges of each commercial building in a list. E.g. for the supermarket Aldi in Macclesfield (UK), I can get 10 nodes from the UI (all the corners/edges of the supermarket), but I can only retrieve 2 of those 10 nodes from osmnx. I need access to the complete list of nodes, but the result is truncated to only 2 of the 10 nodes in this case. I am using the code below:
import osmnx as ox
test = ox.geocode_to_gdf('aldi, Macclesfield, Cheshire, GB')
ax = ox.project_gdf(test).plot()
test.geometry
or
gdf = ox.geometries_from_place('Grosvenor, Macclesfield, Cheshire, GB', tags)
gdf.geometry
Both return just two coordinates and truncate the other results that are available in the OpenStreetMap UI (you can see it in the first column of the attached image: geometry > POLYGON > only two coordinates, with the other results truncated). I would appreciate some help on this, thanks in advance.
It's hard to guess what you're doing here because you didn't provide a reproducible example (e.g., tags is undefined). But I'll try to guess what you're going for.
I am trying to retrieve the coordinates of all nodes/corners/edges of commercial buildings
Here I retrieve all the tagged commercial building footprints in Macclesfield, then extract the first one's polygon coordinates. You could instead filter these by other attribute values as you see fit if you only want certain kinds of buildings. Proper usage of OSMnx's geometries module is described in the documentation.
import osmnx as ox
# get the building footprints in Macclesfield
place = 'Macclesfield, Cheshire, England, UK'
tags = {'building': 'commercial'}
gdf = ox.geometries_from_place(place, tags)
# how many did we get?
print(gdf.shape) # (57, 10)
# extract the coordinates for the first building's footprint
gdf.iloc[0]['geometry'].exterior.coords
Alternatively, if you want a specific building's footprint, you can look up its OSM ID and tell OSMnx to geocode that value:
gdf = ox.geocode_to_gdf('W251154408', by_osmid=True)
polygon = gdf.iloc[0]['geometry']
polygon.exterior.coords
gdf = ox.geocode_to_gdf('W352332709', by_osmid=True)
polygon = gdf.iloc[0]['geometry']
polygon.exterior.coords
list(polygon.exterior.coords)

How to convert world coordinate to view coordinate in VTK

In my program (Python-based), a point needs to be converted from world coordinates ([x,y,z]) to view coordinates ([j,k,t], where j and k are between -1 and 1 and t is the depth) in VTK. I found the vtkCoordinate class with its SetCoordinateSystemToView() method, but it does not work.
coordinate = vtk.vtkCoordinate()
coordinate.SetCoordinateSystemToWorld()
coordinate.SetValue(x,y,z)
coordinate.SetCoordinateSystemToDisplay()
viewCoord=coordinate.GetComputedValue(renderer)
The result is very odd and definitely wrong. There are methods such as GetComputedDisplayValue() and GetComputedViewportValue() that convert from a coordinate system to the display or viewport coordinate system, but there is no method like GetComputedViewValue(). I am very confused and need help.
Thank you.
This works:
import vtk
coordinate = vtk.vtkCoordinate()
coordinate.SetCoordinateSystemToWorld()
coordinate.SetValue(1,2,1)
# test:
from vedo import *
plt = Plotter()
print("press shift-I on the red dot, then press q")
plt.show(Cube(), Point([1,2,1], r=20), axes=1)
viewCoord = coordinate.GetComputedViewportValue(plt.renderer)
print(viewCoord) # matches!
The vtkCamera class has a method GetCompositeProjectionTransformMatrix(aspect, nearz, farz) that returns the matrix converting world coordinates to viewport coordinates:
world2ViewportMatrix = GetCompositeProjectionTransformMatrix(aspect, nearz, farz)
viewPortCoord = world2ViewportMatrix.MultiplyPoint([worldPosition[0], worldPosition[1], worldPosition[2], 1])
After dividing by the homogeneous component viewPortCoord[3], [viewPortCoord[0]/viewPortCoord[3], viewPortCoord[1]/viewPortCoord[3]] is the viewport coordinate (in [-1,1] x [-1,1]) and viewPortCoord[2]/viewPortCoord[3] is the depth (in [nearz, farz]).
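The multiply-then-divide step described above can be sketched without VTK. This is only an illustration of the arithmetic: project_point is a hypothetical helper, and the matrix is assumed to be a plain 4x4 nested list holding the same values the camera matrix would contain.

```python
def project_point(matrix, world_point):
    """Apply a 4x4 projection matrix to a 3D point and do the perspective divide.

    Returns ((x, y), depth) after dividing by the homogeneous component w.
    """
    x, y, z = world_point
    homogeneous = (x, y, z, 1.0)
    # Row-by-row matrix-vector product
    clip = [sum(matrix[r][c] * homogeneous[c] for c in range(4)) for r in range(4)]
    w = clip[3]
    if w == 0:
        raise ValueError("homogeneous component is zero; point cannot be projected")
    return (clip[0] / w, clip[1] / w), clip[2] / w


# Toy matrix (not a real camera matrix): identity except w is scaled by 2
toy_matrix = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 2.0],
]
print(project_point(toy_matrix, (1.0, 2.0, 3.0)))
```

With a real camera, the 16 entries would be read from the vtkMatrix4x4 returned by the camera method instead of the toy matrix used here.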

How do I extract each road in terms of the pixel coordinates from Google Map Screenshot and place them into different lists?

I'm working on a project related to road recognition from a standard Google Map view. Some navigation features will be added to the project later on.
I already extracted all the white pixels (representing road on the map) according to the RGB criteria. Also, I stored all the white pixel (roads) coordinates (2D) in one list named "all_roads". Now I want to extract each road in terms of the pixel coordinates and place them into different lists (one road in one list), but I'm lacking ideas.
I'd like to use Dijkstra's algorithm to calculate the shortest path between two points, but I need to create "nodes" on each road intersection. That's why I'd like to store each road in the corresponding list for further processing.
I hope someone could provide some ideas and methods. Thank you!
Note: The RGB criteria ("if" statements in "threshold" method) seems unnecessary for the chosen map screenshot, but it becomes useful in some other map screenshot with other road colours other than white. (NOT the point of the question anyway but I hope to avoid unnecessary confusion)
# Import numpy to enable numpy array
import numpy as np
# Import time to handle time-related task
import time
# Import mean to calculate the averages of the pixels
from statistics import mean
# Import cv2 to display the image
import cv2 as cv2
def threshold(imageArray):
    """
    Purpose: Display a given image with road in white according to pixel RGBs
    Argument(s): A matrix generated from a given image.
    Return: A matrix of the same size but only displays white and black.
    """
    newAr = imageArray
    for eachRow in newAr:
        for eachPix in eachRow:
            if eachPix[0] == 253 and eachPix[1] == 242:
                eachPix[0] = 255
                eachPix[1] = 255
                eachPix[2] = 255
            else:
                pass
    return newAr
# Import the image
g1 = cv2.imread("1.png")
# fix the output image with resolution of 800 * 600
g1 = cv2.resize(g1,(800,600))
# Apply threshold method to the imported image
g2 = threshold(g1)
index = np.where(g2 == [(255,255,255)])
# x coordinate of the white pixels (roads)
print(index[1])
# y coordinate of the white pixels (roads)
print(index[0])
# Storing the 2D coordinates of white pixels (roads) in a list
all_roads = []
for i in range(len(index[0]))[0::3]:
    all_roads.append([index[1][i], index[0][i]])
#Display the modified image
cv2.imshow('g2', g2)
cv2.waitKey(0)
cv2.destroyAllWindows()

mangle images of vtk from itk

I am reading an image with SimpleITK, but I get these mangled results in VTK. Any help?
I am not sure where things are going wrong here.
Please see image here.
####
CODE
def sitk2vtk(img):
    size = list(img.GetSize())
    origin = list(img.GetOrigin())
    spacing = list(img.GetSpacing())
    sitktype = img.GetPixelID()
    vtktype = pixelmap[sitktype]
    ncomp = img.GetNumberOfComponentsPerPixel()
    # there doesn't seem to be a way to specify the image orientation in VTK
    # convert the SimpleITK image to a numpy array
    i2 = sitk.GetArrayFromImage(img)
    #import pylab
    #i2 = reshape(i2, size)
    i2_string = i2.tostring()
    # send the numpy array to VTK with a vtkImageImport object
    dataImporter = vtk.vtkImageImport()
    dataImporter.CopyImportVoidPointer( i2_string, len(i2_string) )
    dataImporter.SetDataScalarType(vtktype)
    dataImporter.SetNumberOfScalarComponents(ncomp)
    # VTK expects 3-dimensional parameters
    if len(size) == 2:
        size.append(1)
    if len(origin) == 2:
        origin.append(0.0)
    if len(spacing) == 2:
        spacing.append(spacing[0])
    # Set the new VTK image's parameters
    #
    dataImporter.SetDataExtent (0, size[0]-1, 0, size[1]-1, 0, size[2]-1)
    dataImporter.SetWholeExtent(0, size[0]-1, 0, size[1]-1, 0, size[2]-1)
    dataImporter.SetDataOrigin(origin)
    dataImporter.SetDataSpacing(spacing)
    dataImporter.Update()
    vtk_image = dataImporter.GetOutput()
    return vtk_image
###
END CODE
You are ignoring two things:
1. There is an order change when you perform GetArrayFromImage:
The order of indices and dimensions needs careful attention during conversion. Quote from the SimpleITK Notebooks at http://insightsoftwareconsortium.github.io/SimpleITK-Notebooks/01_Image_Basics.html:
ITK's Image class does not have a bracket operator. It has a GetPixel which takes an ITK Index object as an argument, which is an array ordered as (x,y,z). This is the convention that SimpleITK's Image class uses for the GetPixel method as well.
While in numpy, an array is indexed in the opposite order (z,y,x).
2. There is a change of coordinates between ITK and VTK image representations. Historically, in computer graphics there is a tendency to align the camera in such a way that the positive Y axis is pointing down. This results in a change of coordinates between ITK and VTK images.
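To make the index-order point concrete, here is a small pure-Python sketch of the (x,y,z) versus (z,y,x) convention. The helper names are hypothetical, and the offset comparison assumes the NumPy array is C-contiguous:

```python
def numpy_shape_from_itk_size(size_xyz):
    """ITK reports size as (x, y, z); the NumPy array shape is (z, y, x)."""
    return tuple(reversed(size_xyz))


def numpy_index_from_itk_index(index_xyz):
    """An ITK index (x, y, z) addresses the NumPy element [z, y, x]."""
    return tuple(reversed(index_xyz))


def itk_flat_offset(index_xyz, size_xyz):
    """Offset in a buffer where x varies fastest, then y, then z."""
    x, y, z = index_xyz
    size_x, size_y, _ = size_xyz
    return x + size_x * (y + size_y * z)


def c_order_offset(index, shape):
    """Offset of an element in a C-contiguous array of the given shape."""
    offset = 0
    for i, n in zip(index, shape):
        offset = offset * n + i
    return offset


size = (4, 3, 2)       # ITK size, ordered (x, y, z)
itk_index = (1, 2, 0)  # ITK index, ordered (x, y, z)
shape = numpy_shape_from_itk_size(size)
np_index = numpy_index_from_itk_index(itk_index)
print(shape, np_index)
print(itk_flat_offset(itk_index, size), c_order_offset(np_index, shape))
```

Under these assumptions the two conventions address the same position in the flat buffer; what changes is the order in which sizes and indices have to be written when moving between the ITK-style and NumPy-style views.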
