How to convert world coordinate to view coordinate in VTK - vtk

In my Python program, a point needs to be converted from world coordinates ([x, y, z]) to view coordinates ([j, k, t], where j and k are between -1 and 1 and t is the depth) in VTK. I found the vtkCoordinate class with its SetCoordinateSystemToView() method, but it does not work.
coordinate = vtk.vtkCoordinate()
coordinate.SetCoordinateSystemToWorld()
coordinate.SetValue(x,y,z)
coordinate.SetCoordinateSystemToDisplay()
viewCoord=coordinate.GetComputedValue(renderer)
The result is very odd and definitely wrong. There are methods such as GetComputedDisplayValue() and GetComputedViewportValue() that convert a coordinate to the display or viewport coordinate system, but there is no method like GetComputedViewValue(). I am very confused and need help.
Thank you.

This works:
import vtk
coordinate = vtk.vtkCoordinate()
coordinate.SetCoordinateSystemToWorld()
coordinate.SetValue(1,2,1)
# test:
from vedo import *
plt = Plotter()
print("press shift-I on the red dot, then press q")
plt.show(Cube(), Point([1,2,1], r=20), axes=1)
viewCoord = coordinate.GetComputedViewportValue(plt.renderer)
print(viewCoord) # matches!

The vtkCamera class has a method GetCompositeProjectionTransformMatrix(aspect, nearz, farz) that returns the matrix converting world coordinates to viewport coordinates. Called on the vtkCamera instance (here named camera):
world2ViewportMatrix = camera.GetCompositeProjectionTransformMatrix(aspect, nearz, farz)
viewPortCoord = world2ViewportMatrix.MultiplyPoint([worldPosition[0], worldPosition[1], worldPosition[2], 1])
The result is a homogeneous point, so divide by its fourth component: [viewPortCoord[0]/viewPortCoord[3], viewPortCoord[1]/viewPortCoord[3]] is the viewport coordinate (in [-1,1] x [-1,1]), and viewPortCoord[2]/viewPortCoord[3] is the depth (in [nearZ, farZ]).
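To make the multiply-then-divide step in the answer above concrete, here is a minimal sketch in plain Python. The function name world_to_view and the toy matrix are assumptions for illustration only; the matrix is not taken from a real VTK camera.

```python
def world_to_view(matrix, world_point):
    """Apply a 4x4 matrix (row-major nested lists) to an (x, y, z) point.

    Returns the point after the perspective divide by the w component.
    """
    x, y, z = world_point
    homogeneous = [x, y, z, 1.0]
    # Matrix-vector product: one dot product per matrix row.
    hx, hy, hz, hw = (
        sum(m * v for m, v in zip(row, homogeneous)) for row in matrix
    )
    if hw == 0:
        raise ValueError("point maps to w = 0 and cannot be projected")
    # Perspective divide turns homogeneous coordinates into (j, k, depth).
    return (hx / hw, hy / hw, hz / hw)


# Hypothetical matrix: halves x and y, and copies z into w.
toy_matrix = [
    [0.5, 0.0, 0.0, 0.0],
    [0.0, 0.5, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
view_point = world_to_view(toy_matrix, (1.0, 2.0, 4.0))
print(view_point)
```

With VTK, the nested list could probably be filled from the returned vtkMatrix4x4 by calling GetElement(i, j) for each row and column; MultiplyPoint already performs the product, so only the divide remains.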

Related

How to detect an object in an image rather than screen with pyautogui?

I am using pyautogui.locateOnScreen() function to locate elements in chrome and get their x,y coordinates and click them. But at some point I need to take a screenshot of a part of the screen and search for the object I want in this screenshot. Then I get coordinates of it. Is it possible to do it with pyautogui?
My example code:
coord_one = pyautogui.locateOnScreen("first_image.png",confidence=0.95)
scshoot = pyautogui.screenshot(region=coord_one)
coord_two = # search second image in scshoot and if it can be detected get coordinates of it.
If it is not possible with pyautogui, can you advise the easiest and smartest way?
Thanks in advance.
I don't believe there is a built-in direct way to do what you need but the python-opencv library does the job.
The following code sample assumes you have a screen capture you just took (res/original.png in the code) and you want to find a smaller image (res/crop.png) in that capture, which you know is a subsection of the capture.
Minimal example
"""Get bounding box of cropped image from original image."""
import cv2 as cv
import numpy as np
img_rgb = cv.imread(r'res/original.png')
# the cropped image, expected to be smaller
target_img = cv.imread(r'res/crop.png')
_, w, h = target_img.shape[::-1]
res = cv.matchTemplate(img_rgb,target_img,cv.TM_CCOEFF_NORMED)
# with the method used, the data in res are scores indexed by top-left pixel coords
min_val, max_val, min_loc, max_loc = cv.minMaxLoc(res)
top_left = max_loc
# if we add to it the width and height of the target, then we get the bbox.
bottom_right = (top_left[0] + w, top_left[1] + h)
cv.rectangle(img_rgb,top_left, bottom_right, 255, 2)
cv.imshow('', img_rgb)
cv.waitKey(0)  # keep the window open until a key is pressed
MatchTemplate
From the docs, MatchTemplate "simply slides the template image over the input image (as in 2D convolution) and compares the template and patch of input image under the template image." Under the hood, this offers methods such as square difference to compare the images represented as arrays.
See more
For a more in-depth explanation, check the opencv docs as the code is entirely based off their example.
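As a rough illustration of the "sliding" idea described under MatchTemplate, here is a minimal pure-Python sketch using the squared-difference score. It is not how OpenCV implements it, and the function name match_template_sqdiff is made up for this example. Note that with squared differences the best match is the minimum score, whereas the TM_CCOEFF_NORMED code above takes the maximum.

```python
def match_template_sqdiff(image, template):
    """Slide template over image and return (best top-left (x, y), best score).

    image and template are 2D lists of grayscale values; the score is the
    sum of squared differences, so lower means a better match.
    """
    img_h, img_w = len(image), len(image[0])
    tpl_h, tpl_w = len(template), len(template[0])
    best_score, best_loc = None, None
    for y in range(img_h - tpl_h + 1):
        for x in range(img_w - tpl_w + 1):
            score = sum(
                (image[y + dy][x + dx] - template[dy][dx]) ** 2
                for dy in range(tpl_h)
                for dx in range(tpl_w)
            )
            if best_score is None or score < best_score:
                best_score, best_loc = score, (x, y)
    return best_loc, best_score


image = [
    [0, 0, 0, 0, 0],
    [0, 0, 9, 8, 0],
    [0, 0, 7, 6, 0],
    [0, 0, 0, 0, 0],
]
template = [[9, 8], [7, 6]]
top_left, score = match_template_sqdiff(image, template)
# Adding the template width and height gives the bounding box.
bottom_right = (top_left[0] + len(template[0]), top_left[1] + len(template))
print(top_left, bottom_right, score)
```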

Changing long, lat values of Polygon coordinates in python

I have a basic shapefile of all the US states, which can be found here:
shapefile
I am looking to edit the positions of two states, Hawaii and Alaska. I wish to change the coordinates of Hawaii so that it sits roughly under Nevada, and I also wish to change Alaska so that it is considerably smaller and sits roughly below both California and Arizona. I'll include an image to give a visual of my idea.
As you can see, Alaska and Hawaii are sitting at the bottom left of the US mainland, just under the states mentioned before.
I know that for this to happen I need to change the longitude and latitude coordinates of both states using geopandas etc.
So I started off with Hawaii and began accessing the polygon's coordinates using numpy.
Here is a snippet of the code so far:
import pandas as pd
import geopandas as gpd
from shapely.geometry import Point
import matplotlib.pyplot as plt
from shapely.geometry import Polygon
from shapely.geometry import Point, Polygon
import numpy as np
poly_States = gpd.read_file("states.shp")
hawaii = poly_States[poly_States.STATE_ABBR == "HI"]
coords = [i for i in hawaii.geometry]
all_Coords = []
for b in coords[0].boundary:
    coords = np.dstack(b.coords.xy).tolist()
    all_Coords.append(*coords)
for cord_1 in all_Coords:
    for cord2 in cord_1:
        cord2[0] = cord2[0] + 54.00000000000000
My idea here was to access the coordinates in array format and change the longitude (the first value of each pair) by adding 54, basically shifting the entire state to the right so that it sits roughly under New Mexico.
My issue lies in actually writing these changes back to the polygon object in the shapefile itself.
I feel like there is probably an easier method, maybe by accessing attributes of the polygon or using some sort of external software, but I believe that if I'm able to properly access the long/lat values and change them, I should be able to make the changes in positioning and size that I need.
Thanks in advance.
You can use translate and assign the new geometry like this:
m = poly_States.STATE_ABBR == "HI"
poly_States[m] = poly_States[m].set_geometry(poly_States[m].translate(54))
Result:
The same way you can scale and shift Alaska:
m = poly_States.STATE_ABBR == "AK"
poly_States[m] = poly_States[m].set_geometry(poly_States[m].scale(.2,.2,.2).translate(40, -40))
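To show what translate and scale do to the underlying coordinates, here is a minimal pure-Python sketch on a plain list of (x, y) pairs. The helper names and the sample square are made up for this illustration, and the scaling origin (the bounding-box centre) is an assumption meant to mirror the geopandas default; it is not the library's implementation.

```python
def translate(coords, xoff=0.0, yoff=0.0):
    """Shift every (x, y) pair by the given offsets."""
    return [(x + xoff, y + yoff) for x, y in coords]


def scale(coords, xfact=1.0, yfact=1.0):
    """Scale (x, y) pairs about the centre of their bounding box."""
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    cx = (min(xs) + max(xs)) / 2.0
    cy = (min(ys) + max(ys)) / 2.0
    return [(cx + (x - cx) * xfact, cy + (y - cy) * yfact) for x, y in coords]


# Hypothetical 10 x 10 square standing in for a state outline.
square = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
moved = translate(square, xoff=54.0)             # shift east, like Hawaii
shrunk_and_moved = translate(scale(square, 0.2, 0.2), 40.0, -40.0)  # like Alaska
print(moved)
print(shrunk_and_moved)
```

Scaling about the centre keeps the shape in place while it shrinks, which is why the translate step afterwards can use offsets chosen from the original position.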

Transform Plates into Horizontal Using Hough transform

I am trying to straighten images that are not horizontal because they may be slanted.
I tested two images: one photo that is horizontal and one that is not. I get good results with the horizontal photo, but with the second, tilted photo the transform does not do what I expected.
The first image works fine, as shown below, with a theta of 1.6406095. For now it looks bad because I'm trying to make both photos come out horizontally correct.
For the second image, theta is reported as just 1.9198622.
I think the error is at this line:
lines= cv2.HoughLines(edges, 1, np.pi/90.0, 60, np.array([]))
I have done a little simulation on this link with colab.
Any help is welcome.
So far this is what I got.
import cv2
import numpy as np
img=cv2.imread('test.jpg',1)
imgGray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
imgBlur=cv2.GaussianBlur(imgGray,(5,5),0)
imgCanny=cv2.Canny(imgBlur,90,200)
contours,hierarchy =cv2.findContours(imgCanny,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
rectCon=[]
for cont in contours:
    area=cv2.contourArea(cont)
    if area >100:
        #print(area) #prints all the area of the contours
        peri=cv2.arcLength(cont,True)
        approx=cv2.approxPolyDP(cont,0.01*peri,True)
        #print(len(approx)) #prints the how many corner points does the contours have
        if len(approx)==4:
            rectCon.append(cont)
#print(len(rectCon))
rectCon=sorted(rectCon,key=cv2.contourArea,reverse=True) # Sort out the contours based on largest area to smallest
bigPeri=cv2.arcLength(rectCon[0],True)
cornerPoints=cv2.approxPolyDP(rectCon[0],0.01*bigPeri,True) # use the perimeter of the largest contour, not the last one from the loop
# Reorder bigCornerPoints so I can prepare it for warp transform (bird eyes view)
cornerPoints=cornerPoints.reshape((4,2))
mynewpoints=np.zeros((4,1,2),np.int32)
add=cornerPoints.sum(1)
mynewpoints[0]=cornerPoints[np.argmin(add)]
mynewpoints[3]=cornerPoints[np.argmax(add)]
diff=np.diff(cornerPoints,axis=1)
mynewpoints[1]=cornerPoints[np.argmin(diff)]
mynewpoints[2]=cornerPoints[np.argmax(diff)]
# Draw my corner points
#cv2.drawContours(img,mynewpoints,-1,(0,0,255),10)
##cv2.imshow('Corner Points in Red',img)
##print(mynewpoints)
# Bird Eye view of your region of interest
pt1=np.float32(mynewpoints) #What are your corner points
pt2=np.float32([[0,0],[300,0],[0,200],[300,200]])
matrix=cv2.getPerspectiveTransform(pt1,pt2)
imgWarpPers=cv2.warpPerspective(img,matrix,(300,200))
cv2.imshow('Result',imgWarpPers)
cv2.waitKey(0) # keep the window open until a key is pressed
Now you just have to fix the tilt (opencv has skew) and then use some threshold to detect the letters and then recognise each letter.
As for a general purpose, I think images need to be normalised first so that we can easily detect the edges.
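For the theta values quoted in the question, it may help to recall that cv2.HoughLines reports theta as the angle of the line's normal in radians, so a perfectly horizontal line has theta = pi/2. The sketch below converts a theta into the deviation from horizontal; the function name is made up for this example, and the sign you need when actually rotating the image depends on your rotation convention.

```python
import math


def tilt_from_hough_theta(theta):
    """Degrees by which a Hough line deviates from horizontal.

    theta is the normal angle in radians as returned by cv2.HoughLines;
    a horizontal line has theta = pi / 2, giving a tilt of 0.
    """
    return math.degrees(theta) - 90.0


# The two theta values reported in the question.
for theta in (1.6406095, 1.9198622):
    print(round(tilt_from_hough_theta(theta), 1))
```

By this reading the first photo is tilted by about 4 degrees and the second by about 20 degrees, so the second theta is not necessarily wrong; it may simply reflect a larger slant.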

How do I extract each road in terms of the pixel coordinates from Google Map Screenshot and place them into different lists?

I'm working on a project related to road recognition from a standard Google Map view. Some navigation features will be added to the project later on.
I already extracted all the white pixels (representing road on the map) according to the RGB criteria. Also, I stored all the white pixel (roads) coordinates (2D) in one list named "all_roads". Now I want to extract each road in terms of the pixel coordinates and place them into different lists (one road in one list), but I'm lacking ideas.
I'd like to use Dijkstra's algorithm to calculate the shortest path between two points, but I need to create "nodes" on each road intersection. That's why I'd like to store each road in the corresponding list for further processing.
I hope someone could provide some ideas and methods. Thank you!
Note: The RGB criteria ("if" statements in "threshold" method) seems unnecessary for the chosen map screenshot, but it becomes useful in some other map screenshot with other road colours other than white. (NOT the point of the question anyway but I hope to avoid unnecessary confusion)
# Import numpy to enable numpy array
import numpy as np
# Import time to handle time-related task
import time
# Import mean to calculate the averages of the pixels
from statistics import mean
# Import cv2 to display the image
import cv2 as cv2
def threshold(imageArray):
    """
    Purpose: Display a given image with road in white according to pixel RGBs
    Argument(s): A matrix generated from a given image.
    Return: A matrix of the same size but only displays white and black.
    """
    newAr = imageArray
    for eachRow in newAr:
        for eachPix in eachRow:
            if eachPix[0] == 253 and eachPix[1] == 242:
                eachPix[0] = 255
                eachPix[1] = 255
                eachPix[2] = 255
            else:
                pass
    return newAr
# Import the image
g1 = cv2.imread("1.png")
# fix the output image with resolution of 800 * 600
g1 = cv2.resize(g1,(800,600))
# Apply threshold method to the imported image
g2 = threshold(g1)
index = np.where(g2 == [(255,255,255)])
# x coordinate of the white pixels (roads)
print(index[1])
# y coordinate of the white pixels (roads)
print(index[0])
# Storing the 2D coordinates of white pixels (roads) in a list
all_roads = []
for i in range(len(index[0]))[0::3]:
    all_roads.append([index[1][i], index[0][i]])
#Display the modified image
cv2.imshow('g2', g2)
cv2.waitKey(0)
cv2.destroyAllWindows()
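As a possible starting point for the Dijkstra step mentioned in the question, here is a minimal sketch that runs directly on the road pixels, treating each white pixel as a node connected to its four neighbours. This is an assumption about how the graph could be built, not a solution to splitting the pixels into separate roads; the function name shortest_road_path and the sample data are made up.

```python
import heapq


def shortest_road_path(road_pixels, start, goal):
    """Dijkstra over 4-connected road pixels; returns a list of (x, y) or None."""
    roads = set(map(tuple, road_pixels))
    if start not in roads or goal not in roads:
        return None
    dist = {start: 0}
    prev = {}
    heap = [(0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == goal:
            break
        if d > dist[node]:
            continue  # stale queue entry
        x, y = node
        for nb in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if nb in roads and d + 1 < dist.get(nb, float("inf")):
                dist[nb] = d + 1
                prev[nb] = node
                heapq.heappush(heap, (d + 1, nb))
    if goal not in dist:
        return None  # start and goal are on disconnected roads
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    path.reverse()
    return path


# Hypothetical L-shaped road in the same [x, y] format as all_roads.
sample_roads = [[0, 0], [1, 0], [2, 0], [2, 1], [2, 2]]
print(shortest_road_path(sample_roads, (0, 0), (2, 2)))
```

Since every step costs 1 here, a plain breadth-first search would give the same result; Dijkstra only becomes necessary once edges between intersection nodes carry different lengths.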

Why does this code slow down? Graphics.py?

I have some code that reads a small BMP (128x96) file and puts the RGB values into a list.
I then run a nested loop and read the RGB values in reverse from the list and draw them on the screen.
It starts quite quickly and draws the first 20 lines in a second, but it progressively slows down to such an extent that I've never seen it finish. It's only a small 128x96 image.
I feel it's the calls to the graphics.py library, but why? Or is it something else?
I'm running this on a Raspberry Pi, if that's of use. Python 3.4.2
If you're interested in trying it, you can find the supporting files here: https://drive.google.com/open?id=1yM9Vn1Nugnu79l1UNShamEAGd2VWF3T4
(It's the graphics.py library I'm using and the tiny bmp file, also the actual file in question called SlowDownWhy.py)
import math
import sys
from graphics import *
from PIL import Image
# Initialise Vars for Image width n height
iw=0
ih=0
img=Image.open("ani1.bmp","r") # Open Image
iw, ih = img.size # Set image width n height
ch = int(1000/ih) # Cube height set
cw = ch # Cube width set
win = GraphWin("My Window", iw*cw, ih*ch)
win.setBackground(color_rgb(128,128,128))
#Transfer Bitmap RGB values to a flat list - 'RGBlist'
pix_val = list(img.getdata())
RGBlist = [x for sets in pix_val for x in sets]
noe = (iw * ih * 3)-3
x = iw
y = ih
for vy in list(range(ih)):
    y = y-1
    x = iw
    for vx in list(range(iw)):
        x = x-1
        r=RGBlist[noe]
        g=RGBlist[noe+1]
        b=RGBlist[noe+2]
        noe=noe-3
        cx=x*cw
        cy=y*ch
        aPoint = Rectangle(Point(cx,cy), Point(cx+cw,cy+ch))
        aPoint.setFill(color_rgb(r,g,b))
        aPoint.draw(win)
It should create a window no bigger than 1000 pixels in height and draw the picture from the bottom right to the top left, line by line, but it slows down progressively.
Ignoring the invalid syntax, this is simply because of the way graphics.py is programmed: it is not designed to handle this many objects on the screen. (It uses tkinter in the back end, which slows down with 128*96 = 12,288 objects.) For rendering images, you should either integrate them directly or use another library, for example pygame.
To integrate the image into the graphics.py program, there is the Image class, which you overwrote with the PIL.Image import (this is the reason why you should never do import *). Look here: Importing custom images into graphics.py
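One way to avoid creating thousands of rectangles is to build a single scaled-up image and hand that to the window as one object. The sketch below writes the pixels as a binary PPM using only the standard library; the function name pixels_to_ppm and the sample data are made up, and whether your graphics.py or tkinter build reads PPM files is an assumption worth checking.

```python
def pixels_to_ppm(pixels, width, height, scale=1):
    """Return binary PPM (P6) bytes for a flat, row-major list of (r, g, b) tuples.

    Each source pixel becomes a scale x scale block, similar to the
    per-pixel rectangles in the question but stored as one image.
    """
    header = "P6\n{} {}\n255\n".format(width * scale, height * scale).encode("ascii")
    body = bytearray()
    for row in range(height):
        line = bytearray()
        for col in range(width):
            # Repeat the pixel horizontally.
            line.extend(bytes(pixels[row * width + col]) * scale)
        # Repeat the finished row vertically.
        body.extend(line * scale)
    return header + bytes(body)


# Hypothetical 2 x 2 image: red, green, blue, white.
tiny = [(255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 255)]
ppm_bytes = pixels_to_ppm(tiny, 2, 2, scale=3)
print(len(ppm_bytes))
```

In the question's program, list(img.getdata()) already has the right shape for the pixels argument when the image is in RGB mode; the bytes could then be written to a file and loaded as a single image object.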

Resources