How to use the `getUserMedia()` API to simulate WebRTC-like behaviour in the browser?

My primary intention is to set up a VoIP session between two users, A and B, where the raw audio/video media bytes fetched from A's browser are played in B's browser and vice versa.
The reason is that when users C and D are added to the call, we should not have to create a P2P mesh network, which limits performance.
I tried recording media with getUserMedia() and playing it back, but it is not real time and gives a bad user experience. (However, I haven't experimented yet with small video chunks of around 200 ms.)
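For reference, the chunked-recording idea looks roughly like this; just a sketch, with the wss://example.com/ingest endpoint and the codec string as placeholders rather than anything I actually have running:

```typescript
// Sketch: MediaRecorder emits ~200 ms Blobs which are relayed over a WebSocket.
async function streamChunks(): Promise<void> {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true, video: true });
  const ws = new WebSocket("wss://example.com/ingest"); // placeholder relay endpoint

  const recorder = new MediaRecorder(stream, { mimeType: "video/webm;codecs=vp8,opus" });
  recorder.ondataavailable = (ev) => {
    if (ev.data.size > 0 && ws.readyState === WebSocket.OPEN) {
      ws.send(ev.data); // each Blob holds roughly 200 ms of encoded WebM
    }
  };
  ws.onopen = () => recorder.start(200); // timeslice in milliseconds
}
```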
Is there any approach where I can get the raw bytes of the media and play them in the other browser? Currently I have a server in between which can connect to both peers if required.
Any online examples or libraries are welcome.
I have already asked two questions on this topic, each with a 100-reputation bounty, but without much success:
How to use libsrtp or similar library to decrypt/encrypt the WebRTC data stream?
How to integrate part of WebRTC as a static / dynamic library with the existing C++ code?
Related: How to stream, live video playing on my browser to browser of another user?

If I understand you correctly, you're looking at how to have more than two users in the session, right? Without using a mesh topology.
That's possible, and configurable as well, in the sense that only some participants may be active speakers, or everyone may be an active speaker rather than only a receiver, whatever configuration you choose. To me it sounds like you're asking about video conferencing.
There are a couple of tools for this; the best one I can recommend is mediasoup, an SFU (Selective Forwarding Unit).
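Just to give you an idea, publishing your camera to a mediasoup room from the browser looks roughly like this with mediasoup-client v3. This is only a sketch: the signal() helper is a placeholder for your own signalling layer (WebSocket, REST, whatever), since mediasoup deliberately leaves signalling to the application.

```typescript
import { Device } from "mediasoup-client";

// signal() is a placeholder for your own request/response signalling layer.
async function publishCamera(signal: (event: string, data?: unknown) => Promise<any>) {
  const device = new Device();
  // Load the RTP capabilities of the server-side mediasoup router.
  await device.load({ routerRtpCapabilities: await signal("getRouterRtpCapabilities") });

  // The server creates a WebRtcTransport and returns its parameters.
  const transport = device.createSendTransport(await signal("createSendTransport"));

  transport.on("connect", ({ dtlsParameters }, callback, errback) => {
    signal("connectTransport", { dtlsParameters }).then(callback).catch(errback);
  });
  transport.on("produce", ({ kind, rtpParameters }, callback, errback) => {
    signal("produce", { kind, rtpParameters })
      .then(({ id }) => callback({ id }))
      .catch(errback);
  });

  const stream = await navigator.mediaDevices.getUserMedia({ audio: true, video: true });
  for (const track of stream.getTracks()) {
    await transport.produce({ track }); // one Producer per audio/video track
  }
}
```

Other participants then consume those producers through receive transports, so everyone uploads exactly one copy of their media to the SFU instead of one copy per peer.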

I don't know if I understand correctly, but it is not realistic to get raw video data and play it in the browser; it will just kill your bandwidth and performance, because raw data is huge.
You need to use compressed data (a media codec, e.g. H.264) and a protocol to send and receive it. If you are looking for sub-second latency, then WebRTC is already your best choice here. If you have a server in between, distribute your media through that server instead of a mesh. Check this out for WebRTC network topologies:
https://antmedia.io/webrtc-servers/

Related

Web Audio live streaming

There is an audio stream which is sent from a mobile device to the server, and the server sends chunks of data (via web sockets) to the web client.
The question is: what should I use to play this audio in live mode? There should also be a possibility to rewind the audio, listen to what was played before, and then switch back to live mode.
I considered possibilities such as the Media Source API, but it's not supported by Safari and Chrome on iOS, is it? And we need that support.
Also, there is the Web Audio API, which is supported by modern browsers, but I'm not sure whether it makes it possible to listen to audio in live mode and rewind it.
Any ideas or guides on how to implement it?
> I considered possibilities such as the Media Source API, but it's not supported by Safari and Chrome on iOS, is it? And we need that support.

Then you can't use MediaSource Extensions. Thanks Apple!
> And the server sends chunks of data (via web sockets) to the web client.

Without MediaSource Extensions, you have no way of using this data from a web-socket connection. (Unless it's PCM, or you're decoding it to PCM, in which case you could use the Web Audio API, but that is totally impractical and inefficient, and not something you should pursue.)
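For completeness, if the chunks really were raw PCM, scheduling them with the Web Audio API would look something like the sketch below (it assumes mono 48 kHz Float32 samples and a made-up WebSocket URL), but again, this is not a path you should pursue:

```typescript
// Sketch: play incoming Float32 PCM chunks back to back with the Web Audio API.
const ctx = new AudioContext();
let playhead = ctx.currentTime;

const ws = new WebSocket("wss://example.com/pcm"); // hypothetical PCM feed
ws.binaryType = "arraybuffer";
ws.onmessage = (ev) => {
  const samples = new Float32Array(ev.data as ArrayBuffer);
  const buffer = ctx.createBuffer(1, samples.length, 48000); // mono, 48 kHz
  buffer.copyToChannel(samples, 0);

  const source = ctx.createBufferSource();
  source.buffer = buffer;
  source.connect(ctx.destination);

  // Keep a little headroom so a late chunk doesn't get scheduled in the past.
  playhead = Math.max(playhead, ctx.currentTime + 0.05);
  source.start(playhead);
  playhead += buffer.duration;
};
```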
You have to change how you're streaming. You have a few choices:
Best Option: HLS
If you switch to HLS, you'll get the compatibility you need, as well as the ability to go back in time and what not. This is what you should do.
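On the client side, playback is simple: Safari and iOS play HLS natively, and other browsers can use a library such as hls.js on top of MSE. A rough sketch, with a placeholder playlist URL:

```typescript
import Hls from "hls.js";

const video = document.querySelector("video") as HTMLVideoElement;
const src = "https://example.com/live/playlist.m3u8"; // placeholder playlist

if (video.canPlayType("application/vnd.apple.mpegurl")) {
  // Safari / iOS handle HLS (and seeking within the DVR window) natively.
  video.src = src;
} else if (Hls.isSupported()) {
  const hls = new Hls();
  hls.loadSource(src);
  hls.attachMedia(video);
}
```

Seeking back into the stream works as long as the playlist keeps enough past segments (an event or DVR-style playlist).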
Mediocre Option: HTTP Progressive
This is a fine way to stream for most use cases but there isn't any built-in way to handle the stream seeking that you want. You'd have to build it, which is not worth your time since you could just use HLS.
Even More Mediocre Option: WebRTC
You could switch to WebRTC for streaming, but it comes with greatly increased infrastructure costs and complexity, and you still need to figure out how you're going to handle seeking. The only reason to go the WebRTC route is if you absolutely need the lowest latency.

WebRTC - how to synchronize media streams

I'm using WebRTC in a sort of non-conventional way.
I have multiple streams generated by several 'broadcasting' peers being sent to a collection of several 'receiving' peers.
I intend to use an SFU media server (maybe Jitsi or Kurento).
It is very critical that these streams are presented at the receiving peers in a synchronized fashion.
What methods can I use for synchronization? Usually this isn't addressed in WebRTC, because there normally is no consistent clock between peers, but in my case there is a common clock for all the stream sources.
The only ways I can imagine doing it are:
Not worrying about it and hoping that WebRTC's low latency keeps everything in sync.
Somehow encoding timestamp metadata in the WebRTC stream frames and synchronizing display with JavaScript in the browser (see the sketch below).
Using a tool like GStreamer, which can perform video synchronization, to mix the streams into a single stream and forward that to the media server (and thus to the receiving clients). I don't have a good idea of how I would actually perform the synchronization, though.
Any thoughts and advice would be appreciated.
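For option 2, the part I can picture is estimating each receiver's offset from the common source clock over a data channel, NTP-style; here is a rough sketch (the channel name and the message format are made up):

```typescript
type SyncMsg = { t0: number; t1?: number };

// Source side: echo the receiver's timestamp together with the common clock's time.
function startSyncResponder(pc: RTCPeerConnection): void {
  pc.ondatachannel = (ev) => {
    if (ev.channel.label !== "sync") return;
    ev.channel.onmessage = (msg) => {
      const req = JSON.parse(msg.data) as SyncMsg;
      ev.channel.send(JSON.stringify({ t0: req.t0, t1: Date.now() }));
    };
  };
}

// Receiver side: one round trip gives offset ≈ t1 - (t0 + rtt / 2), i.e. how far
// the local clock is from the common source clock.
function estimateOffset(pc: RTCPeerConnection): Promise<number> {
  return new Promise((resolve) => {
    const dc = pc.createDataChannel("sync");
    dc.onopen = () => dc.send(JSON.stringify({ t0: Date.now() }));
    dc.onmessage = (msg) => {
      const { t0, t1 } = JSON.parse(msg.data) as Required<SyncMsg>;
      const rtt = Date.now() - t0;
      resolve(t1 - (t0 + rtt / 2));
    };
  });
}
```

With that offset, a receiver could in principle present a frame captured at source time T at local time T + offset + a fixed delay, but I don't know how to hook that into the browser's rendering pipeline.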
The only OTT system capable of synchronising low-latency streams available (at the time of writing) is the SYE system made by Net Insight. They are able to synchronise any device down to single-digit milliseconds in low-latency mode.
They do not provide any open source that I know of, but you can check it out by downloading an app that uses it:
Primetime
The game starts at 20:00 CET every day; download it on several phones/tablets to verify the sync part.
However, there are other systems I found that can synchronise playback.
HbbTV
HbbTV seems to focus more on IPTV-replacement solutions, as I interpret it. They do not seem to target the wild west of the internet. I might be wrong; please correct me if so.
W3C MULTI-DEVICE TIMING COMMUNITY GROUP
I spoke to the researchers a while back. They can synchronise playback, but they target collaborative viewing; the low-latency part is not in scope as I understand it.
As for WebRTC, LHLS, MPEG-DASH CMAF and all the other solutions: they have no sense of time, so it is not possible to render the same video frame on different devices across access technologies such as 4G, WiFi or cable, or even when the devices use the same technology, because rendering is buffer-controlled, not time-controlled.
/Anders

WebRTC 5-person conference with recording for playback?

I am working on a project for large-group broadcasting in WebRTC. Since it needs to work on iOS and Android devices, I am using Kurento and the iOSWEBRTC Cordova plugin to build this. I am curious whether anyone can help improve my plan, or whether there is an easier way to achieve this.
We need to have a video/audio conference with 5 people per room; however, we also need to be able to show that video to large audiences. My idea is to use Kurento as a middleman and capture the streams into .webm files for live playback while the conference is going on.
Is there a better way to achieve this? And how would I play back the .webm file while it is being recorded? It needs to update and continue playing as more video is sent, basically a live-stream copy of the camera.
I am unsure whether I am going the best route, but I figured this would reduce the bandwidth compared with my original idea, which was:
A 5-person conference for the broadcasters, with X viewers each downloading those streams. However, I realize the upload bandwidth requirement would be crazy high, which is why I settled on this idea. Additionally, the viewers do not have to see it in real time like the broadcasters: the broadcasters need to be able to see and communicate with each other at the same time, while the viewers can be a few seconds behind.
TL;DR:
Trying to make a 5-person video conference with video/audio capturing, then live-stream it to the viewers' players. This would avoid PeerConnection bandwidth limitations. Would this work, or am I forgetting something?
You'll need to look into using an SFU or MCU. An MCU is very costly, but multiplexes video streams and sends down a single video stream to all peers, and can also record that stream. An SFU is a single point of receipt of all streams, and selectively forwards them to clients. It could record off individual streams and then you could do post-processing to make a single recording out of the multiple recorded streams. A mesh network of connections really doesn't work for this use case.
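If you stay on Kurento, the per-stream recording part of this is roughly the sketch below, using kurento-client on Node. It is only an outline: the WebSocket URI and file paths are placeholders, and the SDP offer/answer and ICE exchange with each browser are omitted.

```typescript
import kurento from "kurento-client"; // kurento-client returns promises when callbacks are omitted

async function recordRoom(participantCount: number): Promise<void> {
  const client = await kurento("ws://localhost:8888/kurento"); // placeholder KMS URI
  const pipeline = await client.create("MediaPipeline");

  for (let i = 0; i < participantCount; i++) {
    const webRtcEndpoint = await pipeline.create("WebRtcEndpoint");
    const recorder = await pipeline.create("RecorderEndpoint", {
      uri: `file:///recordings/room1-participant${i}.webm`, // placeholder path
    });

    await webRtcEndpoint.connect(recorder); // whatever the participant sends gets written to disk
    await recorder.record();

    // ...negotiate SDP offer/answer and ICE candidates with this participant's
    // browser against webRtcEndpoint here (omitted).
  }
}
```

You would then either serve the growing .webm files with a short delay, or, more robustly, post-process and repackage them (e.g. into HLS/DASH) for the large audience.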

What libraries/APIs allow me to access real-time audio waveforms of a phone call?

I am looking to build an app that needs to process incoming audio on a phone call in real time.
WebRTC allows for this, but I think it works only for its browser-based P2P audio communication functionality, not for phone calls/VoIP.
Twilio and Plivo allow you to record the audio for batch/later processing.
Is there a library that will give me access to the audio streams in real time? If not, what would I need to build such a service from scratch?
Thanks
If you are open to using a media server (so that the call is no longer P2P but is mediated by the media server using a B2B model), then perhaps the Kurento Media Server may solve your problem. Kurento Media Server makes it possible to create processing capabilities which are applied in real time to the media streams. There are many examples in the documentation of computer vision and augmented reality algorithms applied in real time to the video streams. I've never seen an audio-only processing module, but it should be simple to implement by creating an additional module, which is not too complex if you have some knowledge of C/C++ and media-processing concepts.
Disclaimer: I'm part of the Kurento development team.

Live media streaming involving different kinds of devices

I am working on a project which will involve HTTP live media streaming from a variety of devices, such as Android phones/tablets, iPhone, iPad and the browser. It will be two-way communication for all the devices, with multiple devices connected to a conversation. I have implemented it partially, i.e. one way, by capturing audio from an Android phone (native app) and streaming it to a web browser (HTML5 app) through a PHP server, using ffmpeg and cvlc. I wanted to know the best way to go ahead with it, for example whether there are any standards to be followed.
Also, what kind of server should I be using? I don't want to use a streaming server like Red5. I would like to implement streaming logic similar to Apple's HTTP Live Streaming. I have come across MPEG-DASH, which seems to be a standard for HTTP streaming; I still have to look deeper into it. I was also thinking of using NodeJS for its popularity with streaming (see the sketch below).
Another worry is how to go about capturing media from the devices. Should I use the native capability of the devices to convert the media into MP4 (or any container they support) and then stream it to the server, or should I capture audio and images for a particular period of time, send them to the server and create a common output there? (I am not really sure of this idea.) The separate capture is basically to simplify video streaming from the server end to any device. I was also wondering whether I could completely bypass the server in some cases, like a phone-to-phone or phone-to-tablet connection.
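For what it's worth, if I go the NodeJS + HLS route, my understanding is that the server mostly just serves whatever the encoder writes. Something like this sketch, where Express and all the paths are my own assumptions:

```typescript
// Sketch: ffmpeg (or similar) keeps writing playlist.m3u8 and segments into ./hls,
// and Node only has to serve them as static files.
import express from "express";

const app = express();

app.use(
  "/live",
  express.static("hls", {
    setHeaders: (res, filePath) => {
      // A live playlist changes every few seconds, so never cache it.
      if (filePath.endsWith(".m3u8")) {
        res.setHeader("Cache-Control", "no-cache");
      }
    },
  })
);

app.listen(8080, () => console.log("HLS at http://localhost:8080/live/playlist.m3u8"));
```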
I just wanted to be sure of the things I will be using/implementing so that I wouldn't have to make drastic changes later on. Any help is deeply appreciated. Thank you.
