Description:
I have a Raspberry Pi controlling a small vehicle, with a RealSense camera attached to it. I need to send a live stream from the camera to an HTML page/NodeJS server hosted on Google App Engine, so that the user can see the stream on their page. The stream will be used to manually control the vehicle, so low latency is very important.
What I attempted:
My current solution is a simple socket connection using Socket.IO: I send a frame through the socket, decode it, and display it in the page. The problem is that this method is extremely slow, and from what I understand it is not a good way to stream to a remote client, which is why I need to find a different approach.
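For reference, the Node side of that relay is roughly the following (a simplified sketch of what I have, with the "frame" event name and the base64 JPEG payload as stand-ins rather than my exact code):

// Minimal Socket.IO relay: the Pi emits JPEG frames, browsers receive them.
// This is the approach described above, shown only to make it concrete.
import { Server } from "socket.io";

const io = new Server(3000, { cors: { origin: "*" } });

io.on("connection", (socket) => {
  // Assumed event name: the Pi sends one base64-encoded JPEG per "frame" event.
  socket.on("frame", (jpegBase64: string) => {
    // Re-broadcast to every other client (the browsers viewing the page).
    // Frames queue up behind slow clients, which is one source of the latency.
    socket.broadcast.emit("frame", jpegBase64);
  });
});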
I tried using uv4l: when I run uv4l --driver uvc --device-id "realsense camera id", it says the camera is recognized, but then it immediately stops without any error. When I try to open the stream with my IP and click "call", I get the error "invalid input device". I could not find any solution for this problem.
I also thought about using WebRTC. I tried to follow this example (the closest I found to what I need): https://dev.to/whitphx/python-webrtc-basics-with-aiortc-48id , but it uses a Python server, while I want to use my GAE/NodeJS server, and I'm struggling to figure out how to convert this code to use a Python client and a NodeJS server.
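For context, the only role I picture for the Node server is relaying the SDP between the browser and the aiortc client on the Pi, something roughly like this (a sketch; the "offer"/"answer" event names are just a convention I made up, and the video itself would go peer-to-peer over UDP rather than through the server):

// Sketch of a Node/Socket.IO signaling relay between the browser and the
// aiortc client on the Pi. Event names ("offer", "answer") are my own
// convention, not part of aiortc; the video itself goes peer-to-peer.
import { Server } from "socket.io";

const io = new Server(8080, { cors: { origin: "*" } });

io.on("connection", (socket) => {
  // Browser posts its SDP offer; forward it to the other peer (the Pi).
  socket.on("offer", (sdp: { type: string; sdp: string }) => {
    socket.broadcast.emit("offer", sdp);
  });
  // The aiortc client answers; forward the answer back to the browser.
  socket.on("answer", (sdp: { type: string; sdp: string }) => {
    socket.broadcast.emit("answer", sdp);
  });
});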
If anyone can provide some information or advice I'd really appreciate it.
Since you want to control the vehicle, latency is extremely important. Ideally it should be around 100ms, and it should not exceed 400ms even when the network jitters for a while.
Latency is introduced everywhere: by the encoder on the Raspberry Pi, by the transfer to the media server, and by the H5 (HTML5) player. The encoder and the player contribute the most.
The best solution is to use a UDP-based protocol like WebRTC:
Raspberry Pi                                          PC Chrome H5
Camera --> Encoder ---UDP--> Media Server --UDP---> WebRTC Player
So I recommend using WebRTC to encode on the Pi and send the frames to the media server, with an H5 WebRTC player in the browser. You could test this solution by replacing the encoder with an H5 WebRTC publisher; the latency is about 100ms (please see this wiki). The architecture is below:
Raspberry Pi                                          PC Chrome H5
Camera --> WebRTC ---UDP--> Media Server --UDP---> WebRTC Player
Note: The WebRTC stack is complex, so build it up step by step: start with H5 publisher to H5 player and test the latency, then move the media server from the intranet to the internet and test again, and finally replace the H5 publisher with your Raspberry Pi and test once more.
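For the player side, the browser code is roughly as follows (a sketch only; the /rtc/play signaling endpoint is a placeholder, because each media server exposes its own signaling API):

// Browser-side sketch of the "H5 WebRTC player" box in the diagram above.
// The /rtc/play endpoint is hypothetical; use your media server's signaling API.
async function playWebRtcStream(videoEl: HTMLVideoElement): Promise<void> {
  const pc = new RTCPeerConnection();

  // Receive-only: attach whatever video track the media server sends.
  pc.addTransceiver("video", { direction: "recvonly" });
  pc.ontrack = (ev) => {
    // Some servers announce a stream, others only a bare track.
    videoEl.srcObject = ev.streams[0] ?? new MediaStream([ev.track]);
  };

  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);

  // Send the offer to the media server and apply its answer.
  const res = await fetch("/rtc/play", {
    method: "POST",
    headers: { "Content-Type": "application/sdp" },
    body: offer.sdp ?? "",
  });
  await pc.setRemoteDescription({ type: "answer", sdp: await res.text() });
}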
If you want to get the solution running ASAP, FFmpeg is the easier encoder: encode the frames from the camera, package them as RTMP, publish to the media server over RTMP, and finally play with the H5 WebRTC player (please read this wiki). The latency is larger than with a WebRTC encoder, probably around 600ms, but it should be fine for a demo. The architecture is below, and a Node sketch of the publish step follows the SRT note:
Raspberry Pi                                           PC Chrome H5
Camera --> FFmpeg ---RTMP--> Media Server --UDP---> WebRTC Player
Instead of RTMP, you could also publish over SRT. Note that SRT is also a real-time protocol, with about 200~500ms latency.
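If you drive FFmpeg from a small Node script on the Pi, the publish step looks roughly like this (a sketch; the device path, encoder flags, and RTMP URL are placeholders to adapt):

// Sketch: spawn FFmpeg on the Pi to encode the camera and publish over RTMP.
// /dev/video0, the x264 options, and the RTMP URL are placeholders to adapt.
import { spawn } from "child_process";

const ffmpeg = spawn("ffmpeg", [
  "-f", "v4l2", "-i", "/dev/video0",               // capture from the camera (V4L2)
  "-c:v", "libx264", "-preset", "ultrafast",       // low-latency H.264 encoding
  "-tune", "zerolatency", "-g", "30",
  "-f", "flv", "rtmp://MEDIA_SERVER/live/stream",  // publish to the media server
]);

// FFmpeg logs to stderr; surface it so encoding problems are visible.
ffmpeg.stderr.pipe(process.stderr);
ffmpeg.on("exit", (code) => console.log(`ffmpeg exited with code ${code}`));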
Note that you could also run the media server on the Raspberry Pi itself and use the WebRTC player to play the stream from it when both are on the same WiFi. The latency should be minimal in that case, because the transport stays on the intranet.
I am trying to build a device that will encode H.264 video on a Raspberry Pi and stream it out to a separate web server in the cloud. The main issue I am having is that most implementations I find either run the web server directly on the Pi or have the embedded player play video directly from the device.
I would like it to be pretty much plug and play no matter what network I am on, i.e. no port forwarding of any sort: all I need to do is connect the device to the network, and the stream becomes visible on a webpage.
One possible solution is to simply encode frames as base64 JPEGs and send them to an endpoint on the web server; however, this is a huge waste of bandwidth and won't allow the frame rate that H.264 would.
Any idea on some possible technologies that could be used to do this?
I feel like it can be done with WebSockets or ZeroMQ and FFmpeg somehow, but I am not sure.
It would be helpful if you could provide more description of the architecture of the device. Since it is an RPi, it is probably also doing video acquisition via the camera expansion port. If that is the case, you can access the video device and do quite a bit of streaming with a combination of the available command-line tools.
Something like the following will produce an RTMP stream from the video camera host.
raspivid [preferred options] -o - | ffmpeg -i - [preferred options] rtmp://[IP ADDR]/[location]
From there, FFmpeg will do a lot of heavy lifting for you.
This will now enable remote hosts to access the RTMP stream.
Other tools that could complement that architecture include ffserver, which could take the RTMP stream from the RPi host and make it available to a variety of clients, such as a player in a webpage. A quick look shows that ffserver is now obsolete (it has been dropped from FFmpeg), but there are analogous components.
I want to connect to a camera for zoom control, recording video, and taking snapshots. Can this be done?
If your camera supports the ONVIF (Open Network Video Interface Forum) industry standard, have a look at the onvif and node-onvif npm packages. Both packages provide an API to control pan/tilt/zoom movements. Moreover, they include a discovery function to detect supported cameras on the local network, so you can easily find out whether or not your camera is supported by the packages (a sketch follows the links below).
https://www.npmjs.com/package/onvif
https://www.npmjs.com/package/node-onvif
https://www.onvif.org/profiles/
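A rough sketch with node-onvif, based on my reading of its README (double-check the method names against the package docs, and replace the address and credentials with your camera's):

// Sketch: discover an ONVIF camera, move its zoom, and grab a snapshot
// with node-onvif. Method names follow the package README; verify them
// against the docs, and replace the address/credentials with your own.
// node-onvif ships without TypeScript typings, so it is required untyped.
const onvif = require("node-onvif");

async function main(): Promise<void> {
  // WS-Discovery probe: lists ONVIF devices on the local network.
  const found = await onvif.startProbe();
  console.log(`Found ${found.length} ONVIF device(s)`);

  const device = new onvif.OnvifDevice({
    xaddr: "http://192.168.1.50:80/onvif/device_service", // placeholder address
    user: "admin",                                        // placeholder credentials
    pass: "password",
  });
  await device.init();

  // Zoom in for one second (z > 0 zooms in, z < 0 zooms out).
  await device.ptzMove({ speed: { x: 0, y: 0, z: 0.5 }, timeout: 1 });

  // Take a snapshot; res.body is a Buffer with the JPEG data.
  const res = await device.fetchSnapshot();
  require("fs").writeFileSync("snapshot.jpg", res.body);
}

main().catch(console.error);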
I have just started to delve into streaming libraries and the underlying protocols. I understand RTSP/RTP streaming and what these two protocols are for. But if we only need the IP address, the codec, and the RTSP/RTP protocols to stream video and audio from any camera, why do we have the ONVIF standard, which essentially also aims to standardize communication between IP network devices? I have seen the definitions of ONVIF, so that's not what I am looking for. I want to know why we need ONVIF at all when we already have RTSP/RTP, and what additional benefits it provides.
ONVIF is much more than just video streaming. It's an attempt to standardize the remote protocols for network communication between security devices. This covers things like PTZ control and video analytics, and it goes well beyond just digital camera devices.
Songbird is an open-source media player with lots of plugins. I want to broadcast the media playing in my Songbird over a network. Kindly do not suggest using another player.
Use pulseaudio as your audio framework - it has support for acting as a streaming server. It can play to both your local speakers and a network destination simultaneously.