I am building a project in which I capture video from a webcam, a USB camera, or a URL, and run object detection on the video using the TensorFlow machine learning API. Everything works fine when the input comes from the webcam or an external USB camera, but when I take input from an IP camera via URL the code fails after running for 30-40 seconds.
My code looks like this:
import cv2

vid = cv2.VideoCapture("rtsp://x.xx.xx.xx:554")

while True:
    _, img = vid.read()
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    final_img = show_inference(detection_model, img)
    final_img = cv2.cvtColor(final_img, cv2.COLOR_RGB2BGR)
    cv2.imshow('frame', final_img)
    if cv2.waitKey(1) != -1:
        break

vid.release()
cv2.destroyAllWindows()
This works fine when I run it with the webcam or a USB camera using the lines below:
cv2.VideoCapture(0) or cv2.VideoCapture(1)
But when I run it using the URL, it shows frames for 30-40 seconds and then fails with the error below:
OpenCV(4.4.0) \source\color.cpp:182: error: (-215:Assertion failed) !_src.empty() in function 'cv::cvtColor'
It appears to me that the OpenCV library fails to capture the live feed from the URL, and then the code fails.
Does anyone have an idea how to resolve this issue? Below are the versions and specifications I am using:
Using TensorFlow 2.0 on an i5 machine without a GPU
Hikvision PTZ IP camera
Python version 3.7
OpenCV version 4.4
Code:
Check vid.isOpened(). If it isn't, do not read.
Use rv, img = vid.read() and check that rv is True; otherwise break out of the loop.
Are you throttling the reception of frames in any way? Does your inference step take much time?
Set the camera to a lower FPS value. The camera will produce frames at its own rate; it will not stop or slow down for you.
When you don't read frames, they queue up; they do not disappear. That will eventually cause a crash or other types of failure. You absolutely must consume frames as fast as they are produced.
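Putting those checks together, a minimal sketch of what the read loop could look like (reusing show_inference and detection_model from your question; adjust as needed):

import cv2

vid = cv2.VideoCapture("rtsp://x.xx.xx.xx:554")
if not vid.isOpened():
    raise RuntimeError("could not open the RTSP stream")

while True:
    rv, img = vid.read()
    if not rv:
        # the stream dropped or the frame could not be decoded:
        # stop instead of handing an empty image to cvtColor
        break
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    final_img = show_inference(detection_model, img)
    final_img = cv2.cvtColor(final_img, cv2.COLOR_RGB2BGR)
    cv2.imshow('frame', final_img)
    if cv2.waitKey(1) != -1:
        break

vid.release()
cv2.destroyAllWindows()

If rv starts coming back False after 30-40 seconds, the stream itself is dropping, which usually points back to the backed-up buffer described above.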
Related
I am working on Ubuntu 16.04 and using a USB 2.0 webcam. I want to decrease the frame rate, since the project I'm working on requires face detection, which really lags the video.
I've tried implementing the following code:
import cv2
cap = cv2.VideoCapture(0)
cap.set(cv2.CAP_PROP_FPS, 15)
fps = int(cap.get(5))
print("fps:", fps)
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    cv2.imshow('frame', frame)
    k = cv2.waitKey(1)
    if k == 27:
        break
I get the following error
(python3:24100): GStreamer-CRITICAL **: gst_element_get_state: assertion 'GST_IS_ELEMENT (element)' failed
If I set the frame rate in the above code to 30 (the default frame rate) I get a proper video, but if I change it I get the error above.
How can I decrease the frame rate, either through code or manually through settings (if there's a way)?
Okay, there are several ways you can do this, but I would suggest first checking the capabilities of the webcam. You can do this by installing:
sudo apt-get install v4l-utils
And run:
v4l2-ctl --list-formats-ext
If the desired frame rate is not listed, you can increase the value passed to cv2.waitKey() and time the loop with time.time() to get the frame rate you want.
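For example, a rough sketch of that idea, timing each iteration and sleeping off the remainder (the 15 FPS target is just an assumption):

import time
import cv2

cap = cv2.VideoCapture(0)
target_fps = 15
frame_interval = 1.0 / target_fps

while cap.isOpened():
    start = time.time()
    ret, frame = cap.read()
    if not ret:
        break
    cv2.imshow('frame', frame)
    if cv2.waitKey(1) == 27:   # Esc to quit, as in your code
        break
    # sleep away whatever is left of this frame's time slot
    leftover = frame_interval - (time.time() - start)
    if leftover > 0:
        time.sleep(leftover)

cap.release()
cv2.destroyAllWindows()

Note that this only throttles how often you read and display frames; the camera itself keeps producing at its native rate.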
When I try to open a webcam (FLIR Boson) with OpenCV on a Jetson TX2 it gives the following error:
libv4l2: error set_fmt gave us a different result then try_fmt!
VIDEOIO ERROR: libv4l unable convert to requested pixfmt
I am using this python script:
import numpy as np
import cv2
cap = cv2.VideoCapture(0)
while True:
    # Capture frame-by-frame
    ret, frame = cap.read()

    # Our operations on the frame come here

    # Display the resulting frame
    cv2.imshow('frame', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# When everything is done, release the capture
cap.release()
cv2.destroyAllWindows()
Although it does display the video, it shows those errors. The reason that is relevant is that I am trying to get the FLIR Boson to work with a Jetson TX2 running this program: https://github.com/naisy/realtime_object_detection
I have it working with a regular webcam, but with the FLIR Boson it gives
libv4l2: error set_fmt gave us a different result then try_fmt!
VIDEOIO ERROR: libv4l unable convert to requested pixfmt
VIDEOIO ERROR: V4L: Initial Capture Error: Unable to load initial memory buffers.
Segmentation fault (core dumped)
the errors above and then closes. In my research on the error, it seems to come up for people who use monochrome webcams. Looking at https://www.flir.com/support-center/oem/is-there-a-way-to-maximize-the-video-display-on-the-boson-app-for-windows-pc-to-full-screen/ I am wondering if I need to configure OpenCV or the V4L2 driver to choose the right format for the webcam to prevent the errors (I have sketched, after the manual excerpt below, what I imagine that would look like).
I also have a Jetson Xavier, and the same object detection program works on it (it just has a different build of OpenCV and TensorFlow), so I am guessing there is a slightly different configuration related to webcam format compatibility in the OpenCV install on the Xavier vs. the TX2. I am new to all of this, so forgive me if I ask for more clarification.
One last bit of info: this is from the FLIR Boson manual, related to USB:
8.2.2 USB
Boson is capable of providing digital data as a USB Video Class (UVC) compliant device. Two output options are provided. Note the options are not selected via the CCI but rather by the video capture or viewing software selected by the user. The options are:
■ Pre-AGC (16-bit): The output is linearly proportional to the flux incident on each pixel in the array; output resolution is 320x256 for the 320 configuration, 640x512 for the 640 configuration. Note that AGC settings, zoom settings, and color-encoding settings have no effect on the output signal at this tap point. This option is identified with a UVC video format 4CC code of “Y16 ” (16-bit uncompressed greyscale image)
■ Post-Colorize, YCbCrb: The output is transformed to YCbCr color space using the specified color palette (see Section 6.7). Resolution is 640x512 for both the 320 and 640 configurations. Three options are provided, identified via the UVC video format 4CC code:
• I420: 8 bit Y plane followed by 8 bit 2x2 subsampled U and V planes
• NV12: 8-bit Y plane followed by an interleaved U/V plane with 2x2 subsampling
• NV21: same as NV12 except reverse order of U and V planes
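If it helps to clarify what I mean by choosing a format, this is roughly what I imagine it would look like (just a guess on my part, not something from the manual or from FLIR):

import cv2

cap = cv2.VideoCapture(0)
# request one of the UVC formats listed above, e.g. the 8-bit I420 tap point,
# instead of whatever OpenCV/V4L2 negotiates by default
cap.set(cv2.CAP_PROP_FOURCC, cv2.VideoWriter_fourcc('I', '4', '2', '0'))
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 512)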
I have tried reinstalling everything several times, although it takes a few hours to reflash the TX2 and reinstall OpenCV and TensorFlow. I have tried two different builds of OpenCV. I have tried viewing the webcam with Cheese and have never had a problem.
I don't work with Python, but you need to disable the conversion to RGB:
cap.set(cv2.CAP_PROP_CONVERT_RGB, 0)
See the V4L example from OpenCV.
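In Python that would be roughly the following (an untested sketch; I'm assuming the Boson is device 0 and delivers 16-bit Y16 frames that need to be scaled down before they can be displayed):

import cv2

cap = cv2.VideoCapture(0)
cap.set(cv2.CAP_PROP_CONVERT_RGB, 0)   # hand back the raw sensor data instead of converting

while True:
    ret, frame = cap.read()
    if not ret:
        break
    # frame is 16-bit greyscale; normalize to 8 bits so imshow can display it
    frame8 = cv2.normalize(frame, None, 0, 255, cv2.NORM_MINMAX).astype('uint8')
    cv2.imshow('boson', frame8)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()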
I was able to find a way to get it to work using the code below. It seemed to be a problem with OpenCV interacting with V4L2.
pipeline = "v4l2src device=/dev/video1 ! video/x-raw,width=640,height=512,format=(string)I420,pixel-aspect-ratio=1/1, interlace-mode=(string)progressive, framerate=30/1 ! videoconvert ! appsink"
cap = cv2.VideoCapture(pipeline, cv2.CAP_GSTREAMER)
https://github.com/FLIR/BosonUSB/issues/13
I have a setup with a Raspberry Pi 3 running the latest Jessie with all updates installed, on which I provide an A2DP Bluetooth sink that I connect a phone to in order to play some music.
Via PulseAudio, the source (phone) is routed to the ALSA output (sink). This works reasonably well.
I now want to analyze the audio stream using Python 3.4 with librosa, and I found a promising example using PyAudio, which I adjusted to use the PulseAudio input (which magically works because it's the default) instead of a WAV file:
"""PyAudio Example: Play a wave file (callback version)."""
import pyaudio
import wave
import time
import sys
import numpy
# instantiate PyAudio (1)
p = pyaudio.PyAudio()
# define callback (2)
def callback(in_data, frame_count, time_info, status):
# convert data to array
data = numpy.fromstring(data, dtype=numpy.float32)
# process data array using librosa
# ...
return (None, pyaudio.paContinue)
# open stream using callback (3)
stream = p.open(format=p.paFloat32,
channels=1,
rate=44100,
input=True,
output=False,
frames_per_buffer=int(44100*10),
stream_callback=callback)
# start the stream (4)
stream.start_stream()
# wait for stream to finish (5)
while stream.is_active():
time.sleep(0.1)
# stop stream (6)
stream.stop_stream()
stream.close()
wf.close()
# close PyAudio (7)
p.terminate()
Now, while the data flow works in principle, there is a delay (the length of the buffer) before the stream_callback gets called. Since the docs state
Note that PyAudio calls the callback function in a separate thread.
I would have assumed that while the callback is being worked on, the buffer keeps filling in the main thread. Of course there would be an initial delay while the buffer fills; afterwards I expected to get a synchronous flow.
I need a longer portion in the buffer (see frames_per_buffer) for librosa to be able to perform the analysis correctly.
How is something like this possible? Is it a limitation of the software ports for the Raspberry Pi's ARM?
I found other answers, but they use blocking I/O. How would I wrap this in a thread so that the librosa analysis (which might take some time) does not block the buffer filling?
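Something like the following is roughly what I have in mind: an untested sketch where a reader thread does the blocking read() calls and hands chunks to an analysis thread via a queue.Queue, so the slow librosa work never stalls the device buffer:

import queue
import threading
import numpy
import pyaudio

audio_q = queue.Queue()

def reader(stream, frames):
    # blocking reads in a dedicated thread keep draining the device buffer
    while True:
        audio_q.put(stream.read(frames, exception_on_overflow=False))

def analyzer():
    # slow librosa work happens here, decoupled from the reads
    while True:
        chunk = numpy.frombuffer(audio_q.get(), dtype=numpy.float32)
        # ... run librosa analysis on chunk ...

p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paFloat32, channels=1, rate=44100,
                input=True, frames_per_buffer=4096)
threading.Thread(target=reader, args=(stream, 4096), daemon=True).start()
threading.Thread(target=analyzer, daemon=True).start()

Is that the right direction, or does the callback API already allow this somehow?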
This blog seems to fight performance issues with Cython, but I don't think the delay is a performance issue. Or might it be? Others seem to need some ALSA tweaks, but would that help while using PulseAudio?
Thanks, any input appreciated!
I am trying to get pairs of images out of a Minoru stereo webcam, currently through OpenCV on Linux.
It works fine when I force a low resolution:
left = cv2.VideoCapture(0)
left.set(cv2.cv.CV_CAP_PROP_FRAME_WIDTH, 320)
left.set(cv2.cv.CV_CAP_PROP_FRAME_HEIGHT, 240)
right = cv2.VideoCapture(0)
right.set(cv2.cv.CV_CAP_PROP_FRAME_WIDTH, 320)
right.set(cv2.cv.CV_CAP_PROP_FRAME_HEIGHT, 240)
while True:
    _, left_img = left.read()
    _, right_img = right.read()
    ...
However, I'm using the images to create depth maps, and a bigger resolution would be good. But if I leave the default, or force the resolution to 640x480, I hit errors:
libv4l2: error turning on stream: No space left on device
I have read about USB bandwidth limitations, but:
this happens on the first iteration (the first read() from right)
I don't need anywhere near 60 or even 30 FPS, but I couldn't manage to reduce the "requested FPS" via VideoCapture parameters (if that makes sense)
adding sleeps doesn't seem to help, even between the left/right reads
strangely, if I do a lot of processing (in the while loop), I start noticing lag: things happening in the real world show up much later in the images read. This suggests there is a buffer somewhere that can and does accumulate several images (a lot of them)
I tried a workaround of creating and releasing a separate VideoCapture for each image read, but this is a bit too slow overall (< 1 FPS) and, more importantly, the images are too far out of sync for stereo matching.
I'm trying to understand why this fails, in order to find solutions. It looks like v4l is somehow allocating a single, too-small global buffer that is shared by the two capture objects.
Any help would be appreciated.
I had the same problem and found this answer: https://superuser.com/questions/431759/using-multiple-usb-webcams-in-linux
Since both of the Minoru's cameras report the format as 'YUYV', this is likely a USB bandwidth issue. I lowered the frames per second to 20 (it didn't work at 24) and I can see both 640x480 images.
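In code, lowering the requested frame rate for both captures might look roughly like this (a sketch using the same old-style property names as the question; whether the driver honours CV_CAP_PROP_FPS depends on the camera):

import cv2

left = cv2.VideoCapture(0)
right = cv2.VideoCapture(1)   # assuming the two halves show up as /dev/video0 and /dev/video1

for cam in (left, right):
    cam.set(cv2.cv.CV_CAP_PROP_FRAME_WIDTH, 640)
    cam.set(cv2.cv.CV_CAP_PROP_FRAME_HEIGHT, 480)
    cam.set(cv2.cv.CV_CAP_PROP_FPS, 20)   # 20 fps fit within the shared USB bandwidth for me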
I have two webcams (Logitech C310) connected to a BeagleBoard-xM (Angstrom). I am capturing video from both webcams and displaying it in two separate windows using OpenCV. I am using two threads (one for each webcam) to capture and display the video.
The problem is:
The frame rate I am getting from both cameras is very low (around 3 FPS at 640x480 resolution)
While running the code, it sometimes works fine but sometimes gives the following error:
Gtk:ERROR:gtkaccelmap.c:113:_gtk_accel_map_init: assertion failed: (accel_entry_ht == NULL)
Aborted