I have an app that needs to render frames from a video/movie into a CGBitmapContext with an arbitrary CGAffineTransform. I'd like it to have a decent frame rate, like 20fps at least.
I've tried using AVURLAsset and [AVAssetImageGenerator copyCGImageAtTime:], and as the documentation for this method clearly states, it's quite slow, taking me down to 5fps sometimes.
What is a better way to do this? I'm thinking that I could set up an AVPlayer with an AVPlayerLayer, then use [CALayer renderInContext:] with my transform. Would this work? Or does an AVPlayerLayer perhaps stop running when it notices that it's not being shown on the screen?
Any other ways to suggest?
I ended up getting lovely, quick UIImages from the frames of a video by:
1) Creating an AVURLAsset with the video's URL.
2) Creating an AVAssetReader with the asset.
3) Setting the reader's timeRange property.
4) Creating an AVAssetReaderTrackOutput with the first track from the asset.
5) Adding the output to the reader.
Then for each frame:
6) Calling [output copyNextSampleBuffer].
7) Passing the sample buffer into CMSampleBufferGetImageBuffer.
8) Passing the image buffer into CVPixelBufferLockBaseAddress with the read-only flag (kCVPixelBufferLock_ReadOnly).
9) Getting the base address of the image buffer with CVPixelBufferGetBaseAddress
10) Calling CGBitmapContextCreate with dimensions from the image buffer, passing the base address in as the location of the CGBitmap's pixels.
11) Calling CGBitmapContextCreateImage to get the CGImageRef.
I was very pleased to find that this works surprisingly well for scrubbing. If the user wants to go back to an earlier part of the video, simply create a new AVAssetReader with the new time range and go. It's quite fast!
I'm working to generate an SVG image to represent a graph. For each node, I would like to display an image. As written in the documentation, to use an image, I need to use svgaddfile and svgaddimage.
I wrote this code (I copy only the interesting lines)
svgsetgraphviewbox(0, 0,max(i in V_zero_n_plus_one)X(i)+10, max(i in V_zero_n_plus_one)Y(i)+10)
svgsetgraphscale(5)
svgsetgraphpointsize(5)
svgaddgroup("Customers", "Customers", SVG_BLACK)
svgaddgroup("Depot", "Depot", SVG_BROWN)
svgaddpoint(X(0), Y(0))
svgaddtext(X(0)+0.5, Y(0)-0.5, "Depot")
svgaddfile("./city2.jpg", "city.png")
svgaddimage("city.png", X(0)+0.5, Y(0)-0.5, 20, 20)
svgaddgroup("Routes", "Delivery routes")
svgsave("vrp.svg")
svgrefresh
svgwaitclose("Close browser window to terminate model execution.", 1)
I obtain the following image:
The image is 512x512. What am I doing wrong? Thanks.
There seems to be a timing issue for the uploading of the graphic file when you are using the option '1' in 'svgwaitclose' when running from Workbench (this option means that the underlying HTTP server that is run by mmsvg is stopped immediately once the SVG file has been uploaded).
You could either work with this form:
svgwaitclose("Close browser window to terminate model execution.") ! NB: the second argument defaults to value 0
or add a small delay before this statement:
sleep(2000) ! Wait for 2 seconds
svgwaitclose("Close browser window to terminate model execution.", 1)
The purpose is to take data from a virtual camera (a camera in a Gazebo simulation, updating every second) and use Detectron2 (which requires the data to come from cv2.VideoCapture) to recognize other objects in the simulation. The virtual camera of course does not appear in lspci, so I can't simply use cv2.VideoCapture(0).
So my code is
bridge = CvBridge()
cv_image = bridge.imgmsg_to_cv2(data, desired_encoding='bgr8') #cv_image is numpy.ndarray, size (100,100,3)
cap = cv2.VideoCapture()
ret, frame = cap.read(image=cv_image)
print(ret, frame)
but it just prints False None, I assume because there's nothing being captured in cap.
If I replace cap = cv2.VideoCapture() with cap = cv2.VideoCapture(cv_image), I get the error
TypeError: only size-1 arrays can be converted to Python scalars
since I believe it requires either an integer (representing the webcam number) or a string (representing a video file).
And for reference,
cv_image = bridge.imgmsg_to_cv2(data, desired_encoding='bgr8') # cv_image is numpy.ndarray
cv2.imshow('image', cv_image)
cv2.waitKey(1)
displays the image perfectly fine. Could there be a way to use imshow() or something similar as input for VideoCapture()?
However, cap = cv2.VideoCapture(cv2.imshow('image', cv_image)) opens a blank window and gives me
[ERROR:0] global /io/opencv/modules/videoio/src/cap.cpp (116) open VIDEOIO(CV_IMAGES): raised OpenCV exception:
OpenCV(4.2.0) /io/opencv/modules/videoio/src/cap_images.cpp:293: error: (-215:Assertion failed) !_filename.empty() in function 'open'
How can I create a cv2.VideoCapture() object that can use the image data that I have? Or what's something that might point me in the right direction?
Ubuntu 18.04 and Python 3.6 with opencv-python 4.2.0.34
From what I found on Gazebo tutorials page:
In Rviz, add a ''Camera'' display and under ''Image Topic'' set it to /rrbot/camera1/image_raw.
In your case the topic probably won't be named /rrbot/camera1/image_raw; it will be built from the names you set in the .gazebo file:
<cameraName>rrbot/camera1</cameraName>
<imageTopicName>image_raw</imageTopicName>
<cameraInfoTopicName>camera_info</cameraInfoTopicName>
So you can create a subscriber and use cv2.VideoCapture() for every single image from that topic.
My solution was to rewrite Detectron2's --input flag in the demo to constantly run a ROS2 callback with demo.run_on_image(cv_data). So instead of making it process video, it just quickly processes each new image one at a time. This is a workaround so that cv2.VideoCapture() is not needed.
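A related option, if some piece of code only ever calls read() on the capture object, is to hand it a small stand-in that is fed from the image callback. The following is only a minimal sketch: LatestFrameSource and its push() method are hypothetical names, and whether Detectron2's demo code would accept such an object in place of a real cv2.VideoCapture is an assumption that has not been checked here.

```python
import queue


class LatestFrameSource:
    """Hypothetical stand-in for the read()/isOpened()/release() part of cv2.VideoCapture."""

    def __init__(self, maxsize=1):
        # A small queue keeps only the most recent frame(s) from the callback.
        self._frames = queue.Queue(maxsize=maxsize)
        self._open = True

    def push(self, frame):
        """Call from the image callback (single producer assumed); drops the oldest frame when full."""
        if self._frames.full():
            try:
                self._frames.get_nowait()
            except queue.Empty:
                pass  # a reader took the frame in the meantime
        self._frames.put_nowait(frame)

    def read(self, timeout=2.0):
        """Return (True, frame), or (False, None) if no frame arrives within timeout seconds."""
        if not self._open:
            return False, None
        try:
            return True, self._frames.get(timeout=timeout)
        except queue.Empty:
            return False, None

    def isOpened(self):
        return self._open

    def release(self):
        self._open = False
```

In the subscriber callback this would be used as source.push(bridge.imgmsg_to_cv2(data, desired_encoding='bgr8')), and the consumer would loop on ret, frame = source.read().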
The intended way to take a screenshot via Splinter is pretty straightforward, and I understand that in the context of mimicking a web-browser a screenshot basically means saving an image to a file, but I was wondering if I could throw away that IO concern by directly reading the screenshot into a Python PIL object when I invoke browser.screenshot() . The reason for this is that I would perform some processing on the image regardless so saving it to disk and reading it from disk seems like a step I could short-circuit.
browser = Browser()
screenshot_path = browser.screenshot('absolute_path/your_screenshot.png')
Something like
screenshot_pil = browser.screenshot('path_to', inmemory=True)
Not sure if I missed this in the documentation, but there is a function screenshot_as_png() that seems to do what I want but I'm not sure how to access it through the namespace of a Browser object
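One way this might be reached is through the Selenium WebDriver that Splinter wraps. The sketch below assumes a WebDriver-based Splinter browser (for example Firefox or Chrome) that exposes the underlying driver as browser.driver, and it uses Selenium's get_screenshot_as_png(), which returns the PNG as bytes. The helper names are made up for illustration, and Pillow is assumed to be installed for the second helper.

```python
import io


def screenshot_png_bytes(browser):
    """Return the current page as PNG bytes without writing a file.

    Assumption: browser.driver is the underlying Selenium WebDriver.
    """
    return browser.driver.get_screenshot_as_png()


def screenshot_as_pil(browser):
    """Decode the in-memory PNG into a PIL image (requires Pillow)."""
    # Imported lazily so screenshot_png_bytes() works without Pillow installed.
    from PIL import Image

    return Image.open(io.BytesIO(screenshot_png_bytes(browser)))
```

With that in place, screenshot_pil = screenshot_as_pil(browser) would replace the save-then-reload round trip.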
I am able to use the moviepy library to add a watermark to a section of video. However when I do this it is taking the watermarked segment, and creating a new file with it. I am trying to figure out if it is possible to simply splice in the edited part back into the original video, as moviepy is EXTREMELY slow writing to the disk, so the smaller the segment the better.
I was thinking maybe using shutil?
video = mp.VideoFileClip("C:\\Users\\admin\\Desktop\\Test\\demovideo.mp4").subclip(10,20)
logo = (mp.ImageClip("C:\\Users\\admin\\Desktop\\Watermark\\watermarkpic.png")
.set_duration(20)
.resize(height=20) # if you need to resize...
.margin(right=8, bottom=8, opacity=0) # (optional) logo-border padding
.set_pos(("right","bottom")))
final = mp.CompositeVideoClip([video, logo])
final.write_videofile("C:\\Users\\admin\\Desktop\\output\\demovideo(watermarked).mp4", audio = True, progress_bar = False)
Is there a way to copy the 10 second watermarked snippet back into the original video file? Or is there another library that allows me to do this?
What is slow in your use case is the fact that MoviePy needs to decode and re-encode each frame of the movie. If you want speed, I believe there are ways to ask ffmpeg to copy video segments without re-encoding.
So you could use ffmpeg to cut the video into 3 subclips (before.mp4/fragment.mp4/after.mp4), only process fragment.mp4, then reconcatenate all clips together with ffmpeg.
The cutting into 3 clips using ffmpeg can be done from moviepy:
https://github.com/Zulko/moviepy/blob/master/moviepy/video/io/ffmpeg_tools.py#L27
However for concatenating everything together you may need to call ffmpeg directly.
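As a rough sketch of that cut / process / concatenate idea, the helpers below only build the ffmpeg command lines (stream copy for the cuts, the concat demuxer for joining) and leave running them to subprocess. The function names and file names are illustrative. Two caveats apply: with stream copy the cut points tend to land on keyframes, so the boundaries may not be frame-exact, and the re-encoded fragment generally needs the same codec and parameters as the other parts for a copy-concat to play back cleanly.

```python
import subprocess


def cut_cmd(src, start, end, dst):
    """Build an ffmpeg command copying src from start to end (seconds) into dst.

    Pass end=None to copy from start to the end of the file. No re-encoding.
    """
    cmd = ["ffmpeg", "-y", "-ss", str(start), "-i", src]
    if end is not None:
        cmd += ["-t", str(end - start)]  # -t is a duration, not an end time
    return cmd + ["-c", "copy", dst]


def concat_list_text(paths):
    """Contents of the list file read by ffmpeg's concat demuxer.

    Assumes the paths contain no single quotes.
    """
    return "".join("file '{}'\n".format(p) for p in paths)


def concat_cmd(list_file, dst):
    """Build an ffmpeg command joining the files named in list_file without re-encoding."""
    return ["ffmpeg", "-y", "-f", "concat", "-safe", "0",
            "-i", list_file, "-c", "copy", dst]


def run(cmd):
    """Run one of the commands above; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(cmd, check=True)
```

For the question's example this would mean cut_cmd(src, 0, 10, "before.mp4"), cut_cmd(src, 10, 20, "fragment.mp4") and cut_cmd(src, 20, None, "after.mp4"), watermarking only fragment.mp4 with MoviePy, writing concat_list_text([...]) to a text file, and then running concat_cmd on it.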
Information:
I have a simple ToF(Time of Flight) camera module provided by a vendor that only contains a Depth Node.
I've already set up the PCL environment and can compile and execute the sample code it provides.
The ToF camera module comes with source code that shows how to get raw depth data (the x, y, z values) from the hardware, but it doesn't show how to stream it as both a point cloud image and a depth image.
Win 7 64bit, Visual Studio 2008, PCL all-in-one 32bit.
As a result, I plan to use PCL to show the point cloud image and depth image from the x, y, z data I can get from that camera module and, if possible, to stream them.
However, as far as I can tell, PCL tends to store the point cloud data in a .pcd file and then read it back to produce a point cloud image and a depth image.
That is obviously too slow for streaming if I have to savePCD() and readPCD() for every frame. So I studied the "openni grabber" and "openni range image visualization" sample code and tried to execute them; sadly, "No device connected." is all I got.
I have a few questions I'd like advice on before I try:
Is there a way to use OpenNI with a device other than Kinect, Xtion and PrimeSense? Even if it's just an unnamed device that only has a depth node?
Can PCL show a point cloud image and a depth image without accessing a .pcd file? In other words, can I just assign the value of each vertex and construct an image?
Can I just normalize all the vertices and construct an image with nothing but OpenCV?
Is there any other method to stream that ToF camera module as a point cloud image and a depth image?
1) Changing the OpenNI grabber to use your own ToF camera will be much more work than just calling the camera's sample code in a loop, as shown below.
2) Yes, PCL can show the point cloud and depth image without accessing a .pcd file. All the .pcd loader does is parse the file and place the values in the cloud format. You can do this directly from your camera data, as shown below.
3) I'm not sure what you mean here. I suggest you try the PCL visualizer or cloud viewer, as proposed below.
You can do something like:
#include <pcl/point_cloud.h>
#include <pcl/point_types.h>

pcl::PointCloud<pcl::PointXYZ>::Ptr cloud (new pcl::PointCloud<pcl::PointXYZ>);
cloud->is_dense = true; // assumes the sensor never delivers NaN/Inf points
cloud->width = widthOfTOFsensor;
cloud->height = heightOfTOFsensor;
cloud->points.resize(cloud->width * cloud->height); // one point per sensor pixel
//Create some loop
//grabNewFrame from TOFsensor
for(size_t pointIndex=0;pointIndex<cloud->points.size();pointIndex++)
{
cloud->points[pointIndex].x = tofSensorData[pointIndex].x; //Don't know the tofData format, so I just guessed something.
cloud->points[pointIndex].y = tofSensorData[pointIndex].y;
cloud->points[pointIndex].z = tofSensorData[pointIndex].z;
}
// Plot the data using pcl visualizer or cloud viewer, see:
http://pointclouds.org/documentation/tutorials/cloud_viewer.php#cloud-viewer
http://pointclouds.org/documentation/tutorials/pcl_visualizer.php#pcl-visualizer
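On the question of building a depth image by normalizing the vertices: one possible reading is to min-max scale the z values of each frame into 8-bit gray levels. The sketch below is written in plain Python for brevity (the same arithmetic carries over to C++), and depth_to_gray is a hypothetical helper. Treating non-positive or missing z values as invalid pixels is an assumption, since the ToF data format is not known here.

```python
def depth_to_gray(z_values, width, height):
    """Map row-major z values (width * height of them) to rows of gray levels.

    Valid depths are scaled to 1..255 (near to far); invalid ones become 0.
    Assumption: a z that is None or <= 0 means "no measurement".
    """
    if len(z_values) != width * height:
        raise ValueError("expected width * height depth values")
    valid = [z for z in z_values if z is not None and z > 0]
    if not valid:
        return [[0] * width for _ in range(height)]
    z_min, z_max = min(valid), max(valid)
    span = (z_max - z_min) or 1.0  # avoid dividing by zero on a flat scene
    image = []
    for row in range(height):
        line = []
        for col in range(width):
            z = z_values[row * width + col]
            if z is None or z <= 0:
                line.append(0)
            else:
                line.append(1 + int(round(254 * (z - z_min) / span)))
        image.append(line)
    return image
```

The resulting rows can be copied into an 8-bit single-channel image for display. Because the scaling is recomputed for every frame, brightness will shift between frames unless fixed near and far limits are used instead of the per-frame minimum and maximum.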