How to fix image problems when streaming h.264 via gstreamer udpsink

Using gstreamer I want to stream images from several Logitech C920 webcams to a Janus media server in RTP/h.264 format. The webcams produce h.264 encoded video streams, so I can send the streams to a UDP sink without re-encoding data, only payloading it.
I'm using the gst-interpipe plugin to switch between the different webcams, so that the video stream received by Janus stays the same, but with images coming from whatever webcam I choose.
It works, but I'm seeing broken frames in which the colors are gray and details are blurred away, mainly during the first 5 - 10 seconds after I switch between webcam source streams. After that the images correct themselves.
[Image: first frames]
[Image: after 5 - 10 seconds or more]
First I thought it was a gst-interpipe specific problem, but I can reproduce it by simply setting up two pipelines - one sending a video stream to a UDP sink and one reading from a UDP source:
gst-launch-1.0 -v -e v4l2src device=/dev/video0 ! queue ! \
video/x-h264,width=1280,height=720,framerate=30/1 ! \
rtph264pay config-interval=1 ! udpsink host=127.0.0.1 port=8004
gst-launch-1.0 -v udpsrc port=8004 \
caps="application/x-rtp, media=video, clock-rate=90000, encoding-name=H264, payload=96" ! \
rtph264depay ! decodebin ! videoconvert ! xvimagesink
NB: I'm not experiencing this problem if I send the video stream directly to an xvimagesink, i.e. when not using UDP streaming.
Am I missing some important parameters in my pipelines? Is this a buffering issue? I really have no idea how to correct this.
Any help is greatly appreciated.

Because of the temporal dependencies in video streams, you cannot tune into a stream at an arbitrary point and expect it to be decodable immediately. Correct decoding can only start at a random access point (RAP) frame, e.g. an I- or IDR-frame. Before that, you receive image data that relies on video frames you never received, so it looks broken. Some decoders offer some control over what to do in these cases. avdec_h264 (from gst-libav), for example, has an output-corrupt option. (I don't actually know how it behaves for "correct" frames that are merely missing their reference frames.) Other decoders may have options to skip everything until a RAP frame occurs. This depends on your specific decoder implementation. Note, however, that with any of these options the initial delay before you see any image will increase.
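To make the random-access-point idea concrete, here is a minimal Python sketch. It is not part of GStreamer or of any decoder; the names iter_nal_units, first_idr_index and toy_stream are made up for illustration, and it assumes the data is an H.264 Annex B byte stream (NAL units separated by 00 00 01 start codes). It lists the NAL unit types and finds the first IDR slice (type 5), which is the earliest point where a receiver joining mid-stream could begin decoding cleanly.

```python
START_CODE = b"\x00\x00\x01"  # a 4-byte 00 00 00 01 start code ends with these 3 bytes
NAL_TYPE_IDR = 5              # coded slice of an IDR picture


def iter_nal_units(data):
    """Yield (header_offset, nal_unit_type) for each NAL unit in an Annex B stream."""
    pos = data.find(START_CODE)
    while pos != -1:
        header = pos + len(START_CODE)
        if header < len(data):
            # The low 5 bits of the NAL header byte hold nal_unit_type.
            yield header, data[header] & 0x1F
        pos = data.find(START_CODE, header)


def first_idr_index(data):
    """Return the index of the first IDR NAL unit, or None if there is none."""
    for index, (_, nal_type) in enumerate(iter_nal_units(data)):
        if nal_type == NAL_TYPE_IDR:
            return index
    return None


# Hypothetical toy stream: non-IDR slice (1), SPS (7), PPS (8), IDR slice (5).
toy_stream = (
    b"\x00\x00\x00\x01\x41\xaa"
    + b"\x00\x00\x01\x67\x42"
    + b"\x00\x00\x01\x68\xce"
    + b"\x00\x00\x01\x65\x88"
)

print([nal_type for _, nal_type in iter_nal_units(toy_stream)])
print(first_idr_index(toy_stream))
```

In this toy stream the first NAL unit is a non-IDR slice that depends on earlier pictures, so a receiver that starts there would show broken output until the IDR slice arrives. A real decoder also needs the SPS and PPS parameter sets (types 7 and 8) before the IDR slice; the config-interval=1 setting on rtph264pay in the question re-sends them periodically.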

Related

GStreamer pipeline of 2 wav files onto single RTSP with 2 channels

I'm trying to build a pipeline that takes 2 wav files and streams them as a single RTP stream with 2 channels, where each channel carries the corresponding wav file.
I also want to send the RTP over RTSP so that I can authenticate the RTSP connection.
I've tried using this pipeline
gst-launch-1.0 interleave name=i ! audioconvert ! wavenc ! filesink location=file.wav \
filesrc location=first_audio_file.wav ! decodebin ! audioconvert ! "audio/x-raw,channels=1,channel-mask=(bitmask)0x1" ! queue ! i.sink_0 \
filesrc location=second_audio_file.wav ! decodebin ! audioconvert ! "audio/x-raw,channels=1,channel-mask=(bitmask)0x2" ! queue ! i.sink_1
This takes the 2 wav files and saves them as a new file.wav in the same directory.
The resulting file.wav is a combination of the two and also has 2 channels.
I've tried modifying this pipeline to achieve what I described, but the main issue is making the sink RTSP, with the RTP stream split into 2 channels.
If anyone has a suggestion to solve this, that would be great!
Thanks :)

RTSP is not a streaming transport protocol but a session protocol, so it's completely separate from the actual streaming logic (which you can implement with a GStreamer pipeline). That's also why you have an rtpsink (which you can use to stream over RTP), but not an rtspsink, for example.
To get a working RTSP server, you can use gst-rtsp-server, for example; its repository contains several examples showing how to set it up. Although the examples are all in C, GStreamer also provides bindings for other languages such as Python and JavaScript.

Mix multiple audio streams into one playback-sound using Gstreamer

I want to use Gstreamer to receive audio streams from multiple points on the same port.
Specifically, I want to stream audio from different nodes on the network to one device that listens for incoming audio streams and mixes them before playback.
I know that I should use audiomixer or liveadder for such a task.
But I can't get it to work: the mixer doesn't behave correctly, and when two audio streams arrive, the output sound is very noisy and corrupted.
I used the following command:
gst-launch-1.0.exe -v udpsrc port=5001 caps="application/x-rtp" ! queue ! rtppcmudepay ! mulawdec ! audiomixer name=mix mix. ! audioconvert ! audioresample ! autoaudiosink
but it doesn't work.

Packets arriving on the same port cannot be separated from each other the way your command attempts. To receive multiple audio streams on the same port, you need to distinguish them by SSRC and demultiplex them with rtpssrcdemux.
However, to receive multiple audio streams on multiple ports and mix them, you can use the liveadder element. An example that receives two audio streams on two ports and mixes them is as follows:
gst-launch-1.0 -v \
udpsrc name=src5001 caps="application/x-rtp" port=5001 ! rtppcmudepay ! mulawdec ! audioresample ! liveadder name=m_adder ! alsasink device=hw:0,0 \
udpsrc name=src5002 caps="application/x-rtp" port=5002 ! rtppcmudepay ! mulawdec ! audioresample ! m_adder.
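For the same-port case, the idea behind SSRC demultiplexing can be sketched in a few lines of Python. This is only an illustration of the principle, not the rtpssrcdemux implementation; rtp_ssrc, group_by_ssrc and make_packet are hypothetical helper names, and the sketch relies only on the fixed 12-byte RTP header layout from RFC 3550, where the SSRC occupies bytes 8 to 11.

```python
import struct
from collections import defaultdict


def rtp_ssrc(packet):
    """Return the SSRC field of a raw RTP packet (fixed 12-byte header)."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the 12-byte fixed header")
    # Bytes 8..11 carry the SSRC as a 32-bit integer in network byte order.
    (ssrc,) = struct.unpack_from("!I", packet, 8)
    return ssrc


def group_by_ssrc(packets):
    """Split a mixed list of RTP packets into one list per sender (SSRC)."""
    streams = defaultdict(list)
    for packet in packets:
        streams[rtp_ssrc(packet)].append(packet)
    return dict(streams)


def make_packet(seq, timestamp, ssrc, payload=b""):
    """Build a minimal RTP packet for testing: version 2, payload type 0 (PCMU)."""
    return struct.pack("!BBHII", 0x80, 0, seq, timestamp, ssrc) + payload


# Two hypothetical senders whose packets arrive interleaved on one port.
mixed = [
    make_packet(1, 0, 0x1111, b"a"),
    make_packet(7, 0, 0x2222, b"x"),
    make_packet(2, 160, 0x1111, b"b"),
    make_packet(8, 160, 0x2222, b"y"),
]

streams = group_by_ssrc(mixed)
print({hex(ssrc): len(pkts) for ssrc, pkts in streams.items()})
```

Once the packets are separated per sender, each stream can be depayloaded and decoded on its own before the decoded audio is mixed.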

First, you probably want to use audiomixer rather than liveadder, as the former guarantees synchronization of the different audio streams.
Then, about your mixing problem: you mention that the output sound is "noisy and corrupted", which makes me think of a problem with audio levels. Although audiomixer clips the output audio to the maximum allowed amplitude range, this can result in audio artefacts if your sources are too loud. You might therefore want to experiment with the volume property on both sources.
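The effect of loud sources on a clipping mixer can be illustrated with a small Python sketch. It does not reproduce how audiomixer is implemented; mix_clipped and mix_scaled are hypothetical names, and the sample values are invented 16-bit PCM samples. It only shows why summing two loud signals hits the amplitude limit and how lowering each source's volume before summing avoids that.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def clamp(value):
    """Limit a sample to the signed 16-bit range."""
    return max(INT16_MIN, min(INT16_MAX, value))


def mix_clipped(a, b):
    """Sum two sample lists and clip the result to the 16-bit range."""
    return [clamp(x + y) for x, y in zip(a, b)]


def mix_scaled(a, b, volume=0.5):
    """Scale each source by `volume` before summing, then clamp as a safety net."""
    return [clamp(int(x * volume) + int(y * volume)) for x, y in zip(a, b)]


# Two loud hypothetical sources: their sum exceeds the 16-bit range.
source_a = [20000, -30000, 1000]
source_b = [20000, -30000, 2000]

print(mix_clipped(source_a, source_b))  # peaks are flattened at the limits
print(mix_scaled(source_a, source_b))   # waveform shape is preserved
```

In the clipped mix the first two samples are pinned to the limits, which is heard as distortion; with both sources at half volume the sum stays inside the range.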

Sync audio and video when playing mp4 file with GStreamer

I need to sync video and audio when I play an mp4 file. How can I do that?
Here's my pipeline:
gst-launch-0.10 filesrc location=./big_buck_bunny.mp4 ! qtdemux name=demux \
demux.video_00 ! queue ! TIViddec2 engineName=codecServer codecName=h264dec ! ffmpegcolorspace ! tidisplaysink2 video-standard=pal display-output=composite \
demux.audio_00 ! queue max-size-buffers=500 max-size-time=0 max-size-bytes=0 ! TIAuddec1 ! audioconvert ! audioresample ! autoaudiosink

Have you tried playing the video on a regular desktop without using TI's elements? GStreamer should take care of synchronization for playback cases (and many others).
If the video is perfectly synchronized on a desktop then you have a bug on the elements specific to your target platform (TIViddec2 and tidisplaysink2). qtdemux should already put the expected timestamps on the buffers, so it is possible that TIViddec2 isn't copying those to its decoded buffers or tidisplaysink2 isn't respecting them. (The same might apply to the audio part)
I'd first check TIViddec2 by replacing the rest of the pipeline after it with a fakesink and running gst-launch in verbose mode (-v). The fakesink output should show you the outgoing timestamps; check that they are consistent. You can also put a fakesink right after qtdemux to check the timestamps it produces and see whether the decoders respect them.
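As a rough illustration of what "consistent" timestamps could mean, the following Python sketch checks whether successive presentation timestamps are spaced by one frame duration at an assumed framerate. It is not a GStreamer tool: find_timestamp_anomalies is a hypothetical helper, the 1 ms tolerance is an arbitrary choice, and the timestamps are assumed to have been copied from the fakesink output and converted to nanoseconds by hand.

```python
from fractions import Fraction

NS_PER_SECOND = 1_000_000_000


def find_timestamp_anomalies(pts_ns, fps, tolerance_ns=1_000_000):
    """Return (index, delta_ns) for each gap that differs from one frame duration."""
    expected = Fraction(NS_PER_SECOND) / Fraction(fps)
    anomalies = []
    for i in range(1, len(pts_ns)):
        delta = pts_ns[i] - pts_ns[i - 1]
        if abs(delta - expected) > tolerance_ns:
            anomalies.append((i, delta))
    return anomalies


# Hypothetical timestamps of four frames produced at 30 frames per second.
pts_30fps = [i * NS_PER_SECOND // 30 for i in range(4)]

print(find_timestamp_anomalies(pts_30fps, 30))  # spacing matches 30 fps
print(find_timestamp_anomalies(pts_30fps, 25))  # every gap is off for 25 fps
```

If every gap is flagged, the framerate assumed by the check (or by the pipeline) probably does not match the stream; isolated entries point to dropped or mis-stamped buffers.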

It turned out I was actually using the wrong video framerate.

Syncing audio and video when mp4muxing in gst-launch-1.0

I have a Logitech C920 webcam that provides properly formatted h264 video, and a mic hooked up to an ASUS Xonar external USB sound card. I can read both and mux their data into a single file like this:
gst-launch-1.0 -e \
mp4mux name=muxy ! filesink location=/tmp/out.mp4 \
alsasrc device='hw:Device,0' do-timestamp=true ! audio/x-raw,rate=48000 ! audioconvert ! queue ! lamemp3enc ! muxy.audio_0 \
v4l2src do-timestamp=true ! video/x-h264,framerate=30/1,height=720 ! h264parse ! queue ! muxy.video_0
...but then I get poorly synchronized audio/video. The audio flow consistently starts with 250ms of garbage noise, and the resulting mp4 video is 250ms (7 or 8 frames at 30fps) out of sync.
Seems like the sources start simultaneously, but the sound card inserts 250ms of initialization junk every time. Or perhaps, the camera takes 250ms longer to start up but reports an incorrect start of stream flag. Or, maybe the clocks in my devices are out of sync for some reason. I don't know how to figure out the difference between these (and other) potential root causes.
Whatever the cause, I'd like to patch over the symptoms at least. I've been trying to do any of the following in the gstreamer pipeline, any of which would satisfy my requirements:
Cut out the first 250ms of audio
Delay the video by 250ms or 7 frames
Synchronize the audio and video timestamps properly with attributes like alsasrc slave-method or v4l2src io-mode
And I'm apparently doing it wrong. Nothing works. No matter what, I always end up with the video running 250ms/7 frames ahead of the audio. Adding the queue elements reportedly fixed the sync issue as mediainfo now reports Duration values for audio and video within 20ms of each other, which would be acceptable. But that's not how the resulting videos actually work. Clap my hands, the noise arrives late.
This can be fixed in post processing but why not avoid the hassle and get it right, straight from the gst pipeline? I'm all out of tricks and just about ready to fall back to fixing every single video's sync by hand instead. Any ideas out there?
Thanks for any help, tips, ideas.

Gstreamer: RTP jitter buffer not working properly with packet loss?

For a VoIP speech quality monitoring application I need to compare an incoming RTP audio stream to a reference signal. For the signal comparison itself I use pre-existing, special-purpose tools. For the other parts (except packet capture) the Gstreamer library seemed to be a good choice. I use the following pipeline to simulate a bare-bones VoIP client:
filesrc location=foobar.pcap ! pcapparse ! "application/x-rtp, payload=0, clock-rate=8000"
! gstrtpjitterbuffer ! rtppcmudepay ! mulawdec ! audioconvert
! audioresample ! wavenc ! filesink location=foobar.wav
The pcap file contains a single RTP media stream. I crafted a capture file that's missing 50 of the original 400 UDP datagrams. For the given audio sample (8s long for my example):
[XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX]
with a certain amount of consecutive packet loss I'd expect an audio signal like this to be output ('-' denotes silence):
[XXXXXXXXXXXXXXXXXXXXXXXX-----XXXXXXXXXXX]
however what is actually saved in the audio file is this (1s shorter for my example):
[XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX]
It seems that the jitter buffer, a crucial part for this application, is not working properly. Could this be an incompatibility with / a shortcoming of the pcapparse element? Am I missing a key part in the pipeline to ensure time synchronization? What else could be causing this?

The issue could be solved by slightly altering the pipeline:
An audiorate element needs to be added before wavenc that "produces a perfect stream by inserting or dropping samples as needed".
However, this works only if audiorate receives the packet-loss events. For this, the do-lost property of gstrtpjitterbuffer needs to be set to true.
Here's the corrected pipeline:
filesrc location=foobar.pcap ! pcapparse
! "application/x-rtp, payload=0, clock-rate=8000"
! gstrtpjitterbuffer do-lost=true ! rtppcmudepay ! mulawdec
! audioconvert ! audioresample ! audiorate ! wavenc
! filesink location=foobar.wav
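The numbers in the question fit this explanation: 400 packets for 8 s of audio means 20 ms per packet, i.e. 160 samples at 8000 Hz, so 50 lost packets amount to exactly the missing 1 s. The Python sketch below shows the gap-filling idea on decoded samples. It is a simplified illustration, not the audiorate implementation; fill_gaps is a hypothetical helper, the position of the lost packets is invented, and RTP timestamp wraparound and reordering are ignored.

```python
SAMPLES_PER_PACKET = 160  # 20 ms of audio at 8000 Hz


def fill_gaps(packets, silence=0):
    """Concatenate decoded packets, inserting silence where RTP timestamps jump.

    `packets` is a list of (rtp_timestamp, samples) sorted by timestamp.
    """
    out = []
    expected = None
    for timestamp, samples in packets:
        if expected is not None and timestamp > expected:
            # One silent sample per missing timestamp tick.
            out.extend([silence] * (timestamp - expected))
        out.extend(samples)
        expected = timestamp + len(samples)
    return out


# 400 packets with 50 consecutive ones lost (hypothetical position 240..289).
received = [
    (i * SAMPLES_PER_PACKET, [1] * SAMPLES_PER_PACKET)
    for i in range(400)
    if not 240 <= i < 290
]

naive_length = sum(len(samples) for _, samples in received)
filled = fill_gaps(received)

print(naive_length / 8000)  # duration in seconds without gap filling
print(len(filled) / 8000)   # duration in seconds with silence inserted
```

Simply concatenating the received packets yields the shortened file from the question, while filling by timestamp restores the original duration with silence in place of the lost second.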

GStreamer may just use the dejitter buffer to smooth out the packets on the way to the (audio) output. This wouldn't be unusual; it's the bare minimum definition of dejittering.
It may go so far as reordering out-of-order packets or deleting duplicates, but packet loss concealment (your scenario) can be quite complex.
Basic implementations just duplicate the last received packet, whilst more advanced implementations analyse and reconstruct the tone of the last received packets to smooth out the audio.
It sounds like your application's performance will depend on the exact implementation of loss concealment, so even if GStreamer does do "something", you may have a hard time quantifying its impact on your results unless you understand it in great detail.
Perhaps you could try a pcap with a couple of out-of-order and duplicate packets and check whether GStreamer at least reorders/deletes them. That would go some way to clarifying what is happening.
