Creating cv2.VideoCapture() object directly from numpy array image data - python-3.x

The purpose is to take data from a virtual camera (from a camera in Gazebo simulation, updating every second) and use Detectron2 (requires data come from cv2.VideoCapture) to recognize other objects in the simulation. The virtual camera of course does not appear in lspci so I can't simply use cv2.VideoCapture(0).
So my code is
bridge = CvBridge()
cv_image = bridge.imgmsg_to_cv2(data, desired_encoding='bgr8') #cv_image is numpy.ndarray, size (100,100,3)
cap = cv2.VideoCapture()
ret, frame = cap.read(image=cv_image)
print(ret, frame)
but it just prints False None, I assume because there's nothing being captured in cap.
If I replace the cap = cv2.VideoCapture() line with cap = cv2.VideoCapture(cv_image), I get the error,
TypeError: only size-1 arrays can be converted to Python scalars
since I believe it requires either an integer (representing the webcam number) or a string (representing a video file).
And for reference,
cv_image = bridge.imgmsg_to_cv2(data, desired_encoding='bgr8') # cv_image is numpy.ndarray
cv2.imshow('image', cv_image)
cv2.waitKey(1)
displays the image perfectly fine. Could there be a way to use imshow() or something similar as input for VideoCapture()?
However, cap = cv2.VideoCapture(cv2.imshow('image', cv_image)) opens a blank window and gives me,
[ERROR:0] global /io/opencv/modules/videoio/src/cap.cpp (116) open VIDEOIO(CV_IMAGES): raised OpenCV exception:
OpenCV(4.2.0) /io/opencv/modules/videoio/src/cap_images.cpp:293: error: (-215:Assertion failed) !_filename.empty() in function 'open'
How can I create a cv2.VideoCapture() object that can use the image data that I have? Or what's something that might point me in the right direction?
Ubuntu 18.04 and Python 3.6 with opencv-python 4.2.0.34

From what I found on Gazebo tutorials page:
In Rviz, add a ''Camera'' display and under ''Image Topic'' set it to /rrbot/camera1/image_raw.
In your case it probably won't be /rrbot/camera1/ name, but the one you are setting in .gazebo file
<cameraName>rrbot/camera1</cameraName>
<imageTopicName>image_raw</imageTopicName>
<cameraInfoTopicName>camera_info</cameraInfoTopicName>
So you can create a subscriber and use cv2.VideoCapture() for every single image from that topic.

My solution was to rewrite Detectron2's --input flag in the demo to constantly run a ROS2 callback with demo.run_on_image(cv_data). So instead of making it process video, it just quickly processes each new image one at a time. This is a workaround so that cv2.VideoCapture() is not needed.
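For anyone who still needs something with a VideoCapture-like interface, here is a minimal sketch of an adapter that a ROS callback could push frames into. The class name ArrayFrameSource and its push() method are made up for illustration; only the (ret, frame) return convention of read() is borrowed from cv2.VideoCapture. It assumes a single producer (the callback) and keeps only the newest frame, since the simulated camera updates slowly.

```python
import queue


class ArrayFrameSource:
    """Hypothetical stand-in that mimics the read() convention of cv2.VideoCapture."""

    def __init__(self):
        # Hold at most one frame so the consumer always sees the latest image.
        self._frames = queue.Queue(maxsize=1)

    def push(self, frame):
        """Store a new frame, discarding an unread older one (single producer assumed)."""
        if self._frames.full():
            try:
                self._frames.get_nowait()
            except queue.Empty:
                pass
        self._frames.put_nowait(frame)

    def read(self, timeout=1.0):
        """Return (True, frame), or (False, None) if no frame arrives within timeout seconds."""
        try:
            return True, self._frames.get(timeout=timeout)
        except queue.Empty:
            return False, None

    def isOpened(self):
        # There is no device to open, so the source is always considered open.
        return True
```

In the callback you would call source.push(cv_image) right after bridge.imgmsg_to_cv2(...), and the consumer loop would call source.read() where it previously called cap.read(). Whether Detectron2's demo accepts such an object in place of a real capture is untested here.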

Related

how do i move mouse to texture coordinates?

I'm using Python 3.10.5
and here's my code:
import pyautogui
target = pyautogui.locateCenterOnScreen('target.png')
print(target)
pyautogui.moveTo(target)
but for some reason it just prints None
and doesn't move the mouse to the image's coordinates.
The fact that it prints None means it didn't find the image. Check the image you are trying to find (is it properly cropped?) or try setting the confidence parameter to make a match more likely. Note that the confidence keyword only works when OpenCV is installed.
pyautogui.locateCenterOnScreen('target.png', confidence=x)
# x can be anywhere between 1 and 0, the lower the more likely a match
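import pyautogui
# Start at 0.5 and then scale as needed (confidence requires OpenCV).
target = pyautogui.locateCenterOnScreen('target.png', confidence=0.5)
# None means no match; some newer PyAutoGUI releases raise ImageNotFoundException instead.
if target is not None:
    print(target.x, target.y)
    pyautogui.moveTo(target.x, target.y)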
moveTo() needs coordinates, so pass it the x and y of the point returned by locateCenterOnScreen(). I use this one extensively: locateOnScreen() first to check that the image is actually there, then locateCenterOnScreen() for the actual work. I also use sleep() extensively, as the target applications usually don't respond at computer speed.
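The "check first, sleep, then act" pattern described above can be wrapped in a small polling helper. This is only a sketch: wait_for_match and its parameters are hypothetical names, and the locate argument is any zero-argument callable, so the helper itself does not depend on PyAutoGUI.

```python
import time


def wait_for_match(locate, timeout=5.0, interval=0.25, ignore=()):
    """Call locate() until it returns something other than None or timeout expires.

    ignore is a tuple of exception types to treat as "not found yet".
    Returns the first non-None result, or None on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = locate()
        except ignore:
            result = None
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)


# Possible use with PyAutoGUI (not executed here):
# target = wait_for_match(
#     lambda: pyautogui.locateCenterOnScreen('target.png', confidence=0.5),
#     ignore=(pyautogui.ImageNotFoundException,),
# )
# if target is not None:
#     pyautogui.moveTo(target.x, target.y)
```

Passing the exception type through ignore covers PyAutoGUI releases that raise instead of returning None, without hiding unrelated errors.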

Problems Converting Numpy/OpenCV Array Image into a Wand Image

I'm currently trying to perform a Polar to Cartesian Coordinate Image transformation, to display a raw sonar image into a 'fan-display'.
Initially I have a Numpy Array image of type np.float64, that can be seen below:
After doing some searching, I came across this StackOverflow post Inverse transform an image from Polar to Cartesian in OpenCV with a very similar problem, in which the poster seemed to have solved his/her issue by using the Python Wand library (http://docs.wand-py.org/en/0.5.9/index.html), specifically using their set of Distortion functions.
However, when I tried to use Wand and read the image in, I instead ended up with Wand getting the image below, which seems to be smaller than the original one. However, the weird thing is that img.size still gives the same size number as the original image's shape.
The code for this transformation can be seen below:
print(raw_img.shape) #=> (369, 256)
wand_img = Image.from_array(raw_img.astype(np.uint8), channel_map="I")
display(wand_img)
print("Current image size", wand_img.size) #=> "Current image size (369, 256)"
This is definitely quite problematic as Wand will automatically give the wrong 'fan image'. Is anybody familiar with this kind of problem with the Wand library previously, and if yes, may I ask what is the recommended solution to fix this issue?
If this issue isn't resolved soon I have an alternative backup of using OpenCV's cv::remap function (https://docs.opencv.org/4.1.2/da/d54/group__imgproc__transform.html#ga5bb5a1fea74ea38e1a5445ca803ff121). However the problem with this is that I'm not sure what mapping arrays (i.e. map_x and map_y) to use to perform the Polar->Cartesian transformation, as using a mapping matrix that implements the transformation equations below:
r = polar_distances(raw_img)
x = r * cos(theta)
y = r * sin(theta)
didn't seem to work and instead threw out errors from OpenCV as well.
Any kind of help and insight into this issue is greatly appreciated. Thank you!
- NickS
EDIT I've tried on another image example as well, and it still shows a similar problem. So first, I imported the image into Python using OpenCV, using these lines of code:
import matplotlib.pyplot as plt
from wand.image import Image
from wand.display import display
import cv2
img = cv2.imread("Test_Img.jpg")
img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
plt.figure()
plt.imshow(img_rgb)
plt.show()
which showed the following display as a result:
However, as I continued and tried to open the img_rgb object with Wand, using the code below:
wand_img = Image.from_array(img_rgb)
display(wand_img)
I'm getting the following result instead.
I tried to open the image using wand.image.Image() on the file directly, which is able to display the image correctly when using display() function, so I believe that there isn't anything wrong with the wand library installation on the system.
Is there a missing step that I required to convert the numpy into Wand Image that I'm missing? If so, what would it be and what is the suggested method to do so?
Please keep in mind that the conversion from NumPy to a Wand Image is crucial here: the raw sonar images are stored as binary data, so NumPy is required to convert them into proper images.
Is there a missing step that I required to convert the numpy into Wand Image that I'm missing?
No, but there is a bug in Wand's Numpy implementation in Wand 0.5.x. The shape of OpenCV's ndarray is (ROWS, COLUMNS, CHANNELS), but Wand's ndarray is (WIDTH, HEIGHT, CHANNELS). I believe this has been fixed for the future 0.6.x releases.
If so, what would it be and what is the suggested method to do so?
Swap the values in img_rgb.shape before passing to Wand.
img_rgb.shape = (img_rgb.shape[1], img_rgb.shape[0], img_rgb.shape[2],)
with Image.from_array(img_rgb) as img:
    display(img)
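On the cv2.remap fallback mentioned in the question: remap works backwards, i.e. for every pixel of the output image the maps give the source coordinates to sample (dst(x, y) = src(map_x(x, y), map_y(x, y))). So the forward equations x = r * cos(theta), y = r * sin(theta) have to be inverted. Below is a rough per-pixel sketch in plain Python. The function name, the layout (rows = range bins, columns = beams), the field of view and the origin position are all assumptions, not something from the question.

```python
import math


def fan_pixel_to_polar(x, y, origin_x, origin_y, max_radius_px,
                       num_ranges, num_beams, fov_deg):
    """Map a pixel of the fan image back to (beam_col, range_row) in the raw polar image.

    Assumes the sonar sits at (origin_x, origin_y) in the fan image and the fan
    opens upward, symmetric about the vertical axis. Returns None outside the fan.
    """
    dx = x - origin_x
    dy = origin_y - y  # image y grows downward, the fan opens upward
    r = math.hypot(dx, dy)
    theta = math.degrees(math.atan2(dx, dy))  # 0 = straight ahead, positive to the right
    half_fov = fov_deg / 2.0
    if r > max_radius_px or abs(theta) > half_fov:
        return None
    range_row = r / max_radius_px * (num_ranges - 1)
    beam_col = (theta + half_fov) / fov_deg * (num_beams - 1)
    return beam_col, range_row
```

Filling map_x with the first returned value and map_y with the second for every output pixel (and an out-of-range value such as -1 where the function returns None) should give the float32 arrays that cv2.remap expects.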

Issue with the resize function while performing thresholding in opencv

I am looking to implement a project of computer vision in which I am having issues with the resize function while I am thresholding an image for further processing.
I have already tried various solutions provided on internet regarding this problem but none of them is working for my case. I have even tried to update the version of my opencv library but this has also not given any fruitful results.
Code, which results in an error:
threshold_eye = cv2.resize(threshold_eye, None, fx=5, fy=5, interpolation=cv2.INTER_AREA)
Error:
cv2.error: OpenCV(4.1.0) C:\projects\opencv-python\opencv\modules\imgproc\src\resize.cpp:3718: error: (-215:Assertion failed) !ssize.empty() in function 'cv::resize'
I would appreciate guidance on how to solve this error, or an alternative method to perform the same task.
in the line:
threshold_eye = cv2.resize(threshold_eye, None, fx=5, fy=5, interpolation=cv2.INTER_AREA)
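the second parameter of the function is "dsize", i.e. the output size. Passing None there is allowed as long as fx and fy are given, so it is not what triggers the assertion. The failed check !ssize.empty() refers to the source image: threshold_eye is empty (or None) when it reaches cv2.resize, for example because an earlier read or crop produced no pixels.
Check that threshold_eye is not None and has a non-zero size before resizing. You can also give an explicit output size instead of None, e.g. op_dim = (width, height), but the input still has to be non-empty.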

How to stream depth image from a basic ToF camera module with Point Cloud Library(PCL)

Information:
I have a simple ToF(Time of Flight) camera module provided by a vendor that only contains a Depth Node.
I've already setup the PCL environment and can compile and execute the sample code it provides.
The ToF camera module comes with source code that shows how to get the raw depth data (the x, y, z values) from the hardware, but not how to stream it as both a point cloud image and a depth image.
Win 7 64bit, Visual Studio 2008, PCL all-in-one 32bit.
So I plan to use PCL to show the point cloud image and the depth image from the x, y, z data I can get from that camera module and, if possible, to stream them.
However, as far as I know, PCL tends to store point cloud data in a .pcd file and then read it back to output a point cloud image and a depth image.
That is obviously too slow for streaming if I have to call savePCD() and readPCD() for every frame. So I studied the "openni grabber" and "openni range image visualization" sample code and tried to execute them; sadly, "No device connected." is all I got.
I have a few questions before I go further:
Is there a way to use Openni on a device except Kinect, Xtion and PrimeSense? Even if it's just a device with no name and only has a depth node?
Can PCL show point cloud image and depth image without accessing a .pcd file? In other words, can I just assign the value of each vertex and construct a image?
Can I just normalize all the vertices and construct an image using only OpenCV?
Is there any other method to stream that ToF camera module in point cloud image and depth image?
1) Changing the OpenNI grabber to use your own ToF camera will be much more work than to just use the example from the camera in a loop shown below.
2) Yes PCL can show point cloud image and depth without accessing a .pcd file. What the .pcd loader does is to parse the pcd-file and place the values in the cloud format. You can do this directly from your camera data as shown below.
3) No idea what you mean here. I propose you try to use the pcl visualizer or cloud viewer as proposed below.
You can do something like:
#include <pcl/point_cloud.h>
#include <pcl/point_types.h>
pcl::PointCloud<pcl::PointXYZ>::Ptr cloud (new pcl::PointCloud<pcl::PointXYZ>);
cloud->is_dense = true;
cloud->width = widthOfTOFsensor;
cloud->height = heightOfTOFsensor;
cloud->points.resize(cloud->width * cloud->height); // allocate one point per sensor pixel
// Create some loop
// grabNewFrame from TOFsensor
for (std::size_t pointIndex = 0; pointIndex < cloud->points.size(); pointIndex++)
{
    cloud->points[pointIndex].x = tofSensorData[pointIndex].x; // Don't know the tofData format, so I just guessed something.
    cloud->points[pointIndex].y = tofSensorData[pointIndex].y;
    cloud->points[pointIndex].z = tofSensorData[pointIndex].z;
}
// Plot the data using pcl visualizer or cloud viewer, see:
// http://pointclouds.org/documentation/tutorials/cloud_viewer.php#cloud-viewer
// http://pointclouds.org/documentation/tutorials/pcl_visualizer.php#pcl-visualizer
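If question 3 means building a grayscale depth image straight from the z values, the idea can be sketched as below. This is plain Python for illustration only, not PCL or OpenCV code; depth_to_gray is a hypothetical helper, and the input is assumed to be a row-major list of z values where non-positive or non-finite entries mark invalid pixels.

```python
import math


def depth_to_gray(z_values, width, height):
    """Normalize per-pixel depth to 8-bit gray levels (near = bright, invalid = 0)."""
    if len(z_values) != width * height:
        raise ValueError("expected width * height depth samples")

    def is_valid(z):
        return math.isfinite(z) and z > 0

    valid = [z for z in z_values if is_valid(z)]
    if not valid:
        return [[0] * width for _ in range(height)]
    z_min, z_max = min(valid), max(valid)
    span = (z_max - z_min) or 1.0  # avoid dividing by zero on a flat scene

    rows = []
    for r in range(height):
        row = []
        for c in range(width):
            z = z_values[r * width + c]
            if is_valid(z):
                # 255 at the nearest point, 1 at the farthest; 0 stays reserved for invalid.
                row.append(255 - round((z - z_min) / span * 254))
            else:
                row.append(0)
        rows.append(row)
    return rows
```

The resulting rows could then be copied into an 8-bit single-channel image in whatever display library is at hand.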

Drawing frames from a movie into a CGBitmapContext

I have an app that needs to render frames from a video/movie into a CGBitmapContext with an arbitrary CGAffineTransform. I'd like it to have a decent frame rate, like 20fps at least.
I've tried using AVURLAsset and [AVAssetImageGenerator copyCGImageAtTime:], and as the documentation for this method clearly states, it's quite slow, taking me down to 5fps sometimes.
What is a better way to do this? I'm THINKING that I could set up an AVPlayer with an AVPlayerLayer, then use [CALayer renderInContext:] with my transform. Would this work? Or perhaps does an AVPlayerLayer not run when it notices that it's not being shown on the screen?
Any other ways to suggest?
I ended up getting lovely, quick UIImages from the frames of a video by:
1) Creating an AVURLAsset with the video's URL.
2) Creating an AVAssetReader with the asset.
3) Setting the reader's timeRange property.
4) Creating an AVAssetReaderTrackOutput with the first track from the asset.
5) Adding the output to the reader.
Then for each frame:
6) Calling [output copyNextSampleBuffer].
7) Passing the sample buffer into CMSampleBufferGetImageBuffer.
8) Passing the image buffer into CVPixelBufferLockBaseAddress, read-only.
9) Getting the base address of the image buffer with CVPixelBufferGetBaseAddress.
10) Calling CGBitmapContextCreate with dimensions from the image buffer, passing the base address in as the location of the CGBitmap's pixels.
11) Calling CGBitmapContextCreateImage to get the CGImageRef.
I was very pleased to find that this works surprisingly well for scrubbing. If the user wants to go back to an earlier part of the video, simply create a new AVAssetReader with the new time range and go. It's quite fast!
