WebRTC audio quality drops after a few seconds

I just started learning WebRTC and I have a problem: the audio quality (I don't care about video) drops after a few seconds. At the beginning the quality is perfect, but then it degrades. I'm on a private network where the only things running are a Raspberry Pi with a USB audio card attached and a streaming server on it, a small switch, and a PC where I listen to the stream coming from the Raspberry Pi.
I tried modifying the SDP string (setting parameters such as the bandwidth) without success.
Does anybody have an idea?
Thank you very much in advance,
David.

The default settings give you mono audio at around 42 kb/s, as WebRTC's Opus defaults are primarily tuned for voice. Here is how to increase the audio quality:
Disable autoGainControl, echoCancellation and noiseSuppression in the getUserMedia() constraints:
navigator.mediaDevices.getUserMedia({
    audio: {
        autoGainControl: false,
        channelCount: 2,
        echoCancellation: false,
        latency: 0,
        noiseSuppression: false,
        sampleRate: 48000,
        sampleSize: 16,
        volume: 1.0
    }
});
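For context, a minimal sketch of how a stream captured with these constraints would typically be handed to a peer connection; pc and constraints here are assumptions (an already-created RTCPeerConnection and the constraints object above):

async function startAudio(pc, constraints) {
    // constraints is the audio-constraints object shown above; pc is an
    // existing RTCPeerConnection (both are assumptions for this sketch).
    const stream = await navigator.mediaDevices.getUserMedia(constraints);
    // Hand the captured high-quality audio track to the peer connection for sending.
    stream.getAudioTracks().forEach(track => pc.addTrack(track, stream));
    return stream;
}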
Add the stereo and maxaveragebitrate attributes to the SDP:
let answer = await peer.conn.createAnswer(offerOptions);
answer.sdp = answer.sdp.replace('useinbandfec=1', 'useinbandfec=1; stereo=1; maxaveragebitrate=510000');
await peer.conn.setLocalDescription(answer);
This should output a string which looks like this:
a=fmtp:111 minptime=10;useinbandfec=1; stereo=1; maxaveragebitrate=510000
This gives a potential maximum bitrate of 510 kb/s for stereo, i.e. 255 kb/s per channel. The actual bitrate depends on the speed of your network and the strength of your signal.
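To check what you are actually getting at runtime, here is a rough sketch of polling the outgoing audio bitrate with getStats(); the connection name pc is again an assumption:

// Rough outgoing-audio bitrate monitor (assumes an existing RTCPeerConnection named pc).
let lastBytes = 0, lastTimestamp = 0;
setInterval(async () => {
    const stats = await pc.getStats();
    stats.forEach(report => {
        if (report.type === 'outbound-rtp' && report.kind === 'audio') {
            if (lastTimestamp) {
                // bytes -> bits, divided by the elapsed milliseconds, gives kbit/s.
                const kbps = 8 * (report.bytesSent - lastBytes) / (report.timestamp - lastTimestamp);
                console.log('outgoing audio bitrate ~', kbps.toFixed(0), 'kb/s');
            }
            lastBytes = report.bytesSent;
            lastTimestamp = report.timestamp;
        }
    });
}, 1000);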
You can read more about the other available attributes at: https://www.rfc-editor.org/rfc/rfc7587

Related

How to capture raw audio samples every 10 ms using FFmpeg programmatically?

I am using FFmpeg to record audio via PulseAudio at a 48 kHz sampling rate and a 32 kb/s bitrate. I then encode these recorded samples with Opus, which is configured with its default settings, such as frame_size set to 120, initial_padding set to 120, etc. I am not sure about the recording callback settings for PulseAudio, but I have read in the Opus RFC that it uses 120 ms by default and can support 10, 20, 40, 60 ms and any other multiple of 2.5 ms.
I have a project where I have added WebRTC p2p connection support to send FFmpeg-recorded audio/video over WebRTC's audio/video channels to the next connected peer. Sending plain audio and plain video recorded through FFmpeg over the audio/video channels is not an issue (obviously); however, for FFmpeg-encoded data I have to modify the WebRTC audio/video encoder factories (that part is clear to me).
I want to capture raw audio samples with FFmpeg via a callback every 10 ms.

Problem understanding audio stream number of samples when decoded with ffmpeg

The two streams I am decoding are an audio stream (ADTS AAC, 1 channel, 44100 Hz, 8-bit, 128 kb/s) and a video stream (H264), both received in an MPEG-TS stream, but I noticed something that doesn't make sense to me when I decode the AAC audio frames and try to line up the audio/video timestamps. I'm decoding the PTS for each video and audio frame, however I only get a PTS in the audio stream every 7 frames.
When I decode a single audio frame I always get back 1024 samples. The video frame rate is 30 fps, so I see 30 frames, each with 1024 samples, which equals 30,720 samples and not the expected 44,100 samples. This is a problem when computing the timeline, as the timestamps differ slightly between the audio and video streams. It's very close, but since I compute the timestamps via (1024 samples * 1,000 / 44,100 * 10,000 ticks) it's never going to line up exactly with the 30 fps video.
Am I doing something wrong here with decoding the ffmpeg audio frames, or am I misunderstanding audio samples?
In my particular application these timestamps are critical, as I am trying to line up LTC timestamps, which are decoded at the audio-frame level, with the video frames.
FFProbe.exe:
Video:
r_frame_rate=30/1
avg_frame_rate=30/1
codec_time_base=1/60
time_base=1/90000
start_pts=7560698279
start_time=84007.758656
Audio:
r_frame_rate=0/0
avg_frame_rate=0/0
codec_time_base=1/44100
time_base=1/90000
start_pts=7560686278
start_time=84007.625311
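For reference, a small sketch of the per-frame timing arithmetic implied above; the figures follow directly from 1024-sample AAC frames at 44100 Hz and the 1/90000 time_base reported by ffprobe (nothing here is specific to the asker's code):

// Timing arithmetic for 1024-sample AAC frames at 44100 Hz (illustrative sketch only).
const samplesPerFrame = 1024;
const sampleRate = 44100;
const frameDurationSec = samplesPerFrame / sampleRate;            // ~0.02322 s per decoded frame
const audioFramesPerSecond = sampleRate / samplesPerFrame;        // ~43.07 audio frames per second
const ptsStep90kHz = samplesPerFrame * 90000 / sampleRate;        // ~2089.8 ticks in the 1/90000 time_base
const ptsStep100ns = samplesPerFrame * 1000 / sampleRate * 10000; // ~232199.5 of the 100 ns ticks used in the question
console.log(frameDurationSec, audioFramesPerSecond, ptsStep90kHz, ptsStep100ns);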

Live streaming: node-media-server + Dash.js configured for real-time low latency

We're working on an app that enables live monitoring of your backyard.
Each client has a camera connected to the internet, streaming to our public node.js server.
I'm trying to use node-media-server to publish an MPEG-DASH (or HLS) stream to be available for our app clients, on different networks, bandwidths and resolutions around the world.
Our goal is to get as close as possible to live "real-time" so you can monitor what happens in your backyard instantly.
The technical flow already accomplished is:
An ffmpeg process on our server processes the incoming camera stream (a separate child process for each camera) and publishes the stream via RTMP on the local machine for node-media-server to use as an input (we are also saving segmented files, generating thumbnails, etc.). The ffmpeg command responsible for that is:
-c:v libx264 -preset ultrafast -tune zerolatency -b:v 900k -f flv rtmp://127.0.0.1:1935/live/office
node-media-server is running with what I found to be the default configuration for live streaming:
private NMS_CONFIG = {
    server: {
        secret: 'thisisnotmyrealsecret',
    },
    rtmp_server: {
        rtmp: {
            port: 1935,
            chunk_size: 60000,
            gop_cache: false,
            ping: 60,
            ping_timeout: 30,
        },
        http: {
            port: 8888,
            mediaroot: './server/media',
            allow_origin: '*',
        },
        trans: {
            ffmpeg: '/usr/bin/ffmpeg',
            tasks: [
                {
                    app: 'live',
                    hls: true,
                    hlsFlags: '[hls_time=2:hls_list_size=3:hls_flags=delete_segments]',
                    dash: true,
                    dashFlags: '[f=dash:window_size=3:extra_window_size=5]',
                },
            ],
        },
    },
};
As I understand it, out of the box NMS (node-media-server) publishes the input stream it gets in multiple output formats: FLV, MPEG-DASH and HLS.
With all sorts of online players for these formats I'm able to access and play the stream using the localhost URL. With MPEG-DASH and HLS I'm getting anything between 10-15 seconds of delay, and sometimes more.
My goal now is to implement a local client-side MPEG-DASH player using dash.js, configured to be as close as possible to live.
My code for that is:
<!doctype html>
<html>
<head>
    <title>Dash.js Rocks</title>
    <style>
        video {
            width: 640px;
            height: 480px;
        }
    </style>
</head>
<body>
    <div>
        <video autoplay="" id="videoPlayer" controls=""></video>
    </div>
    <script src="https://cdnjs.cloudflare.com/ajax/libs/dashjs/3.0.2/dash.all.min.js"></script>
    <script>
        (function(){
            // var url = "https://dash.akamaized.net/envivio/EnvivioDash3/manifest.mpd";
            var url = "http://localhost:8888/live/office/index.mpd";
            var player = dashjs.MediaPlayer().create();

            // config
            targetLatency = 2.0;        // Lowering this value will lower latency but may decrease the player's ability to build a stable buffer.
            minDrift = 0.05;            // Minimum latency deviation allowed before activating catch-up mechanism.
            catchupPlaybackRate = 0.5;  // Maximum catch-up rate, as a percentage, for low latency live streams.
            stableBuffer = 2;           // The time that the internal buffer target will be set to post startup/seeks (NOT top quality).
            bufferAtTopQuality = 2;     // The time that the internal buffer target will be set to once playing the top quality.

            player.updateSettings({
                'streaming': {
                    'liveDelay': 2,
                    'liveCatchUpMinDrift': 0.05,
                    'liveCatchUpPlaybackRate': 0.5,
                    'stableBufferTime': 2,
                    'bufferTimeAtTopQuality': 2,
                    'bufferTimeAtTopQualityLongForm': 2,
                    'bufferToKeep': 2,
                    'bufferAheadToKeep': 2,
                    'lowLatencyEnabled': true,
                    'fastSwitchEnabled': true,
                    'abr': {
                        'limitBitrateByPortal': true
                    },
                }
            });

            console.log(player.getSettings());

            setInterval(() => {
                console.log('Live latency= ', player.getCurrentLiveLatency());
                console.log('Buffer length= ', player.getBufferLength('video'));
            }, 3000);

            player.initialize(document.querySelector("#videoPlayer"), url, true);
        })();
    </script>
</body>
</html>
With the online test video (https://dash.akamaized.net/envivio/EnvivioDash3/manifest.mpd) I see that the live latency value is close to 2 seconds (but I have no way to actually confirm it, since it's a streamed video file; in my office I have a camera, so there I can actually compare the latency between real life and the stream I get).
However, when working locally with my NMS, this value does not want to go below 20-25 seconds.
Am I doing something wrong? Is there any configuration on the player (client-side HTML) I'm forgetting, or a missing configuration I should add on the server side (NMS)?
HLS and MPEG-DASH are not particularly low latency as standard, and the figures you are getting are not unusual.
Some examples from a publicly available DASH Industry Forum document (linked below) show comparable figures even for major streaming providers; given the resources of some of those organisations, the results you have achieved are not bad!
There is quite a focus in the streaming industry at this time on enabling lower latency, the target being to come as close as possible to traditional broadcast latency.
One key component of the latency in chunked Adaptive Bit Rate streaming (ABR, see this answer for more info: https://stackoverflow.com/a/42365034/334402) is the need for the player to receive and decode one or more segments of the video before it can display it. Traditionally the player had to receive the entire segment before it could start to decode and display it; the diagram in the first open source reference linked below illustrates this.
Low latency DASH and HLS leverage CMAF, the 'Common Media Application Format', which breaks segments (which might be 6 seconds long, for example) into smaller 'chunks' within each segment. These chunks are designed to allow the player to decode and start playing them before it has received the full segment.
Other sources of latency in a typical live stream will be any transcoding from one format to another and any delay in a streaming server receiving the feed, from the webcam in your case, and encoding and packaging it for streaming.
There is quite a lot of good information available on low latency streaming at this time both from standards bodies and open source discussions which I think will really help you appreciate the issues (all links current at time of writing). From open source and standards discussions:
https://dashif.org/docs/Report%20on%20Low%20Latency%20DASH.pdf (DASH focus)
https://github.com/video-dev/hlsjs-rfcs/pull/1 (HLS focus)
and from vendors:
https://bitmovin.com/cmaf-low-latency-streaming/
https://websites.fraunhofer.de/video-dev/dash-js-low-latency-streaming-with-cmaf/
https://aws.amazon.com/blogs/media/alhls-apple-low-latency-http-live-streaming-explained/
Note - a common use case often quoted in the broadcast world is someone watching a live event, like a game, hearing their neighbours celebrate a goal or touchdown before they see it themselves, because their feed has higher latency than their neighbours'. While this is a driver for low latency, it is really a synchronisation issue, which would require other solutions if a 'perfectly' synchronised experience were the goal.
As you can see, low latency streaming is not a simple challenge, and you may want to consider other approaches depending on the details of your use case: how many subscribers you have, whether some loss of quality is a fair trade-off for lower latency, and so on. As mentioned by @user1390208 in the comments, a more real-time-focused video communication technology like WebRTC may be a better match for the solution you are targeting.
If you want to provide a service that offers both live streaming and recordings, you may want to consider a real-time protocol for the live view and HLS/DASH for anyone looking back through recordings, where latency is less important but quality may be more key.
I have worked on similar tasks, trying to get as close to real time as possible.
During our research we found that DASH is not quite the fastest way to achieve real-time streaming. With some major tune-ups on the partner's media-server side and in FFmpeg we could get a 6-second delay. That's quite enough for the user-facing part of the system, but for stream moderators and admins we want to be closer to real time.
But there is another solution, and we use it: WebSockets/WebRTC.
We mostly use the Wowza Streaming Engine as an enterprise solution, and with some tune-ups we have achieved a 2-second delay over WebRTC.
In the case of NMS there is also a way to get websocket-flv streaming out of the box.
Just for testing, using the player example from their GitHub site, I've achieved about 4-4.5 seconds of delay right now.
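For reference, a minimal sketch of such a websocket-flv player built with flv.js; the port, app and stream name are taken from the NMS config earlier in this thread, so treat the URL as an assumption and adjust it to your setup:

<script src="https://cdn.jsdelivr.net/npm/flv.js/dist/flv.min.js"></script>
<video id="liveVideo" controls autoplay muted></video>
<script>
    // Sketch: play the node-media-server websocket-flv output of the 'live/office' stream.
    // ws://localhost:8888/live/office.flv assumes the http port (8888) and the
    // app/stream names used elsewhere in this thread.
    if (flvjs.isSupported()) {
        var flvPlayer = flvjs.createPlayer({
            type: 'flv',
            isLive: true,
            url: 'ws://localhost:8888/live/office.flv'
        });
        flvPlayer.attachMediaElement(document.getElementById('liveVideo'));
        flvPlayer.load();
        flvPlayer.play();
    }
</script>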
Maybe this information will be useful.

Invalid sample rate while recording audio

I have two microphones: a Blue Snowball iCE and a Stage Line microphone (EMG-500P, DMS-1 microphone base, MPR-1 pre-amp, and a sound card).
I am using the Python sounddevice library example from its Git repository and it works fine. Audio recording from the Blue microphone works for many different sampling rates such as 16000, 44100 and 48000 Hz (-r = 16000/44100/48000), but the Stage Line microphone only records audio when the sampling rate is 44100 or 48000 Hz. Any other sampling rate throws an error:
PortAudioError: Error opening InputStream: Invalid sample rate
[PaErrorCode -9997]
How can I record at 16 kHz from the Stage Line microphone? Why am I not able to sample at 16 kHz? Is it because of the sound card?
I am using the exact sounddevice Python example code to record from both the Blue microphone and the Stage Line microphone.
Thanks.

Some streams don't play on iOS AVPlayer

I have a weird problem with playing some streams. They don't play, and AVPlayerItem returns the error "Cannot Decode". By inspecting them in other software (e.g. VLC Media Player), I noticed that the problem is with the audio: all streams with an audio bitrate of 128 kb/s or higher are not playable. Is it possible to convert this bitrate "on the fly" to a lower one, or is there another approach? I use AVPlayerLayer, AVPlayer and AVPlayerItem to play the streams.
