Embedded audio conference API - audio

I don't have much knowledge of this area (WebRTC, video conferencing, audio conferencing, etc.).
I want to add client support via audio conference to my system (a web application).
I looked at Twilio, which seems like a good solution, but I don't think it fits my case, because it always needs a virtual phone number to work and I don't need one in my system.
What I need is something like Facebook calls or Google Hangouts (without video).
Is there any solution/library/API for this? It doesn't need to be a free solution.

Related

Stream music from streaming platform (Deezer, Spotify, Soundcloud) to Web Audio API

Do any of you know a way to get the audio stream of a music platform and plug it into the Web Audio API?
I am building a music visualizer based on the Web Audio API. It currently reads sound from my computer's mic and produces a real-time visualization. If I play music loud enough, my viz works!
But now I'd like to move on and read only the sound coming from my computer, so that the visualization reacts only to the music and not to other sounds such as people chatting.
I know I can buffer an MP3 file in that API and it would work perfectly. But in 2020, streaming music is very common, via Deezer, Spotify, SoundCloud, etc.
I know they all have an API, but they often offer an SDK where you cannot really do more than "play" music. There is no easy access to the stream of audio data. Maybe I am wrong, and that is why I am asking for your help.
Thanks
The way to stream music to Web Audio is to use a MediaElementAudioSourceNode or a MediaStreamAudioSourceNode. However, these nodes will output zeros (silence) unless you're allowed to access the data. This means you have to set the CORS property correctly on your end (for a media element, its crossOrigin attribute), and the server also has to allow the access through CORS.
A Google search will help with setting up CORS. But many sites won't allow access unless you have the right permissions, in which case you are out of luck.
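As a rough illustration of the MediaElementAudioSourceNode route, here is a minimal TypeScript sketch. The *Like interfaces and the function names connectStreamToAnalyser and readSpectrum are hypothetical; the interfaces only mirror the handful of browser members the sketch uses, so it can be type-checked outside a browser. It assumes the server really does send CORS headers for the audio URL.

```typescript
// Hypothetical structural types: only the browser members this sketch touches.
interface MediaElementLike {
  crossOrigin: string | null;
  src: string;
}
interface AudioNodeLike {
  connect(destination: AudioNodeLike): unknown;
}
interface AnalyserLike extends AudioNodeLike {
  fftSize: number;
  readonly frequencyBinCount: number;
  getByteFrequencyData(bins: Uint8Array): void;
}
interface AudioContextLike {
  readonly destination: AudioNodeLike;
  createMediaElementSource(element: MediaElementLike): AudioNodeLike;
  createAnalyser(): AnalyserLike;
}

// Wire: media element -> source node -> analyser -> speakers.
function connectStreamToAnalyser(
  ctx: AudioContextLike,
  element: MediaElementLike,
  url: string
): AnalyserLike {
  // Ask for a CORS-enabled fetch; set this before assigning src.
  element.crossOrigin = "anonymous";
  element.src = url;

  const source = ctx.createMediaElementSource(element);
  const analyser = ctx.createAnalyser();
  analyser.fftSize = 2048;

  source.connect(analyser);
  analyser.connect(ctx.destination); // keep the music audible
  return analyser;
}

// Read one frame of frequency data for the visualizer.
// Expect all zeros when the server has not granted CORS access.
function readSpectrum(analyser: AnalyserLike): Uint8Array {
  const bins = new Uint8Array(analyser.frequencyBinCount);
  analyser.getByteFrequencyData(bins);
  return bins;
}
```

In a browser, an AudioContext and an audio element should fit these shapes; readSpectrum would then typically be called once per animation frame after playback has started.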
I found a "no-code" workaround. At least on Ubuntu 18.04, I am able to tell Firefox to use my speakers as the "microphone input".
You just have to select the right "mic" in the list when your browser asks for mic permission.
This solution is very convenient, since I do not need to write platform-specific binding code to access the audio stream.
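If you would rather not pick the device by hand each time, the same choice can in principle be made from code. The TypeScript sketch below is only an illustration: DeviceInfoLike, pickMonitorInput and monitorConstraints are hypothetical names, and it assumes the speaker loopback shows up as an audio input whose label starts with "Monitor of", which is how PulseAudio usually names it but is not guaranteed on other setups.

```typescript
// Hypothetical shape: the subset of MediaDeviceInfo fields used below.
interface DeviceInfoLike {
  kind: string;
  label: string;
  deviceId: string;
}

// Pick the first audio input that looks like a speaker loopback.
// Assumption: the loopback label starts with "Monitor of".
function pickMonitorInput(devices: DeviceInfoLike[]): DeviceInfoLike | undefined {
  return devices.filter(function (device) {
    return (
      device.kind === "audioinput" &&
      device.label.toLowerCase().indexOf("monitor of") === 0
    );
  })[0];
}

// Build getUserMedia constraints that pin the capture to that device.
function monitorConstraints(device: DeviceInfoLike) {
  return {
    audio: {
      deviceId: { exact: device.deviceId },
      // Speech-oriented processing tends to distort music, so turn it off.
      echoCancellation: false,
      noiseSuppression: false,
      autoGainControl: false,
    },
    video: false,
  };
}
```

In the browser, the device list would come from navigator.mediaDevices.enumerateDevices() and the constraints would go to navigator.mediaDevices.getUserMedia(); the resulting stream can then feed a MediaStreamAudioSourceNode. Note that device labels are empty until the user has granted microphone permission at least once.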

music playback on Google Home

I'm trying to migrate a music player Alexa Skill to Google Home, but I cannot find a pre-built music playback feature (in Actions or Dialogflow). I want to stream music from my own music server (not from Spotify or Google Music).
I found a couple of examples using buildRichResponse and/or MediaObject, but these are not exactly a playback service.
Does anyone know if Google Home has multimedia playback, or an easy way to do it?
Thx
The Assistant's Media response is the nearest parallel to Alexa's AudioPlayer, although there are clearly differences between the two:
Alexa's playback is done outside the context of a session / conversation. So once you start a playback, you only have the playback controls available. Assistant Media controls are part of a conversation, so you can fully handle anything the user might say.
One consequence of this is that Alexa treats the playback as the result of the skill, while the Assistant treats it as part of the Action.
Google only sends an event when media playback has finished, and doesn't give any indication of why it has finished. Alexa reports more of the controls and has more events describing the state of the playback.
This makes it fairly easy to "queue up" the next audio for Alexa, but that brings additional complexity for how to handle when the queue ends up being wrong at the last moment. The Assistant doesn't have any way to queue the next audio, so there ends up being a gap between the audio ending and the next beginning while the event is handled on the server.
Although the approaches are slightly different, both seem to offer a basic long-audio playback service.
This doesn't sound like what you are trying to do, but if you are looking for something slightly more static, you can also look at the content actions that Google supports.
See https://github.com/Limag/aiplayer/ for an example that plays self-hosted MP3s. Unfortunately, even a volume change is not recognized, and there seems to be no way to add this.
If you use Google Play Music, you can upload MP3s with a small helper application provided by Google. Google Play Music works well, but has some other disadvantages. For example, it is unusable for audiobooks, because all playlists always start from the beginning.

Access to audio from audio card with WebRTC

I'd like to be able to capture the audio from the audio card of my computer and to dispatch it with WebRTC. However, I am not sure if it's possible or not to have access to the audio directly produced by my computer.
According to this repo, https://github.com/niklasenbom/RecordingApp/blob/master/app.js, there is some system audio capture, but I'm not sure if it's what I'm looking for.
Thanks,
You can do it by using NAudio. I actually did the same project myself; I will put it on GitHub in a few weeks and update this answer. You can configure the frequency etc. and use its OnDataAvailable event to dispatch the sound to registered clients.

What libraries/APIs allow me to access real-time audio waveforms of a phone call?

I am looking to build an app that needs to process incoming audio on a phone call in real time.
WebRTC allows for this, but I think it works only within its browser-based P2P audio communication, not for phone calls/VoIP.
Twilio and Plivo allow you to record the audio for batch/later processing.
Is there a library that will give me access to the audio streams in real time? If not, what would I need to build such a service from scratch?
Thanks
If you are open to using a media server (so that the call is no longer P2P but is mediated by the media server using a B2B model), then perhaps the Kurento Media Server may solve your problem. Kurento Media Server makes it possible to create processing capabilities that are applied in real time to the media streams. There are many examples in the documentation of computer vision and augmented reality algorithms applied in real time to video streams. I've never seen an audio-only processing module, but it should be simple to implement by creating an additional module, which is not too complex if you have some knowledge of C/C++ and media processing concepts.
Disclaimer: I'm part of the Kurento development team.

Record audio from various internal devices in Android (via undocumented API)

I was wondering whether it is possible to capture audio data from other sources like the system out, FM radio, bluetooth headset, etc. I'm particularly interested in capturing audio from the FM radio and already investigated all possibilities including trying to sniff the raw bluetooth communication between the phone and the radio device with no luck. It's too bad Android only allows recording audio from the MIC.
I've looked at the Android source code and couldn't find a backdoor to allow me to do that without rooting the device. Do you, at least, have any idea how to use other devices (maybe access somehow /dev/audio) say via NDK or even better - Java (maybe Reflection?) to trick the system to capture the audio stream from say, the FM radio. (in my case I'm trying to develop the app for the HTC Desire)
PS. And for those of you who are against using undocumented APIs, please don't post here - I'm writing an app that will be for my personal use or even if I ever publish it I will warn the user of possible incompatibilities.
I've spent quite some time deciphering the audio stack, and I think you may try to hijack libaudio. You'll have trouble speaking directly to the hardware (/dev/*) because many devices use proprietary audio drivers. There's no rule in this regard.
However, the audio hardware abstraction layer (HAL) provided by /system/lib/libaudio.so should expose the API described at http://source.android.com/porting/audio.html
The Android system, and especially audioflinger, uses this libaudio HAL to find available devices, deal with routing, and of course to read/write PCM data.
So, you could hijack the interaction between audioflinger and libaudio by renaming the latter and providing your own libaudio that decorates the real one. Doing so, you should be able to log what happens and very possibly intercept the FM radio output, provided that it is not handled directly by the hardware.
Of course, all this requires rooting. Please comment if you manage to do this; I'd be interested to hear about it.

Resources