Sharing a microphone audio stream on Linux

For context: my scenario is developing an accessibility application, not any kind of malicious eavesdropping. Within this scenario there are various research and development use cases, all of which would greatly benefit from the microphone audio stream being readable by multiple simultaneously running, unrelated processes, such as recording tools and/or different versions of my own code.
Problem Statement
I am reading a microphone input stream using a high-level Python API, as follows:
import sounddevice

audio_stream = sounddevice.InputStream(
    device=self.microphone_device,
    channels=max(self.channels),
    samplerate=self.audio_props['sample_rate'],
    blocksize=int(self.audio_props['frame_elements_size']),
    callback=self.audio_callback)
I would like to learn whether it is possible (on Linux) to read the microphone audio stream simultaneously with another program, such as Google Meet or Zoom, that is already reading it, i.e. to effectively share the audio stream.
As is, with the mentioned Python wrapper it is no big surprise that when the above code is started while a video call is in progress, it simply fails to open the stream:
Expression 'paInvalidSampleRate' failed in 'src/hostapi/alsa/pa_linux_alsa.c', line: 2043
Expression 'PaAlsaStreamComponent_InitialConfigure( &self->playback, outParams, self->primeBuffers, hwParamsPlayback, &realSr )' failed in 'src/hostapi/alsa/pa_linux_alsa.c', line: 2716
Expression 'PaAlsaStream_Configure( stream, inputParameters, outputParameters, sampleRate, framesPerBuffer, &inputLatency, &outputLatency, &hostBufferSizeMode )' failed in 'src/hostapi/alsa/pa_linux_alsa.c', line: 2837
Admittedly, I am not yet very well versed in ALSA terminology or the Linux sound stack in general.
My question is: can this be accomplished directly using the ALSA library API, or otherwise via other sound stacks or sound system configuration? Or, if none of that is meant to work, via a proxy program/driver that can expose an audio buffer to multiple consumers without noticeably degrading audio stream latency?

You can do this directly with ALSA. Dsnoop should do the trick. It is a plugin included with ALSA that allows sharing input streams.
From the page I linked above:
dsnoop is the equivalent of the dmix plugin, but for recording sound. The dsnoop plugin allows several applications to record from the same device simultaneously.
From the ALSA docs:
If you want to use multiple input (capture) clients you need to use the dsnoop plugin:
You can poke around there for details on how to use it. This issue on GitHub will also help you get started; it details how to configure the dsnoop interface so you can read from it with pyaudio.
Update
To configure ALSA, edit /etc/asound.conf with something like this (from the ALSA docs on dsnoop):
pcm.mixin {
    type dsnoop
    ipc_key 5978293 # must be unique for all dmix plugins!!!!
    ipc_key_add_uid yes
    slave {
        pcm "hw:1,0"
        channels 2
        period_size 1024
        buffer_size 4096
        rate 44100
        periods 0
        period_time 0
    }
    bindings {
        1 1
        1 0
    }
}
You can test to see if your configuration works with something like this:
arecord -d 30 -f cd -t wav -D pcm.mixin test.wav
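If that works, the same shared PCM can be opened from Python. Below is a minimal sketch, assuming the pcm.mixin definition above and that PortAudio exposes the named ALSA PCM so sounddevice can select it by the name "mixin"; the parameters mirror the slave configuration:

import sounddevice as sd

def audio_callback(indata, frames, time, status):
    # indata holds one block of captured samples
    if status:
        print(status)

# open the dsnoop-backed PCM; other readers (arecord, Zoom, ...) can
# capture from the same microphone at the same time
with sd.InputStream(device="mixin",
                    channels=2,
                    samplerate=44100,
                    blocksize=1024,
                    callback=audio_callback):
    sd.sleep(5000)  # capture for five seconds

Because dsnoop does the sharing at the ALSA layer, several such readers can hold the device open simultaneously.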

So, this is more an audio question than a Python question, I guess. :)
Depending on the API, streams can be device-exclusive or not. ASIO for professional audio, for example, is often device-exclusive, so just one application (like a DAW) has access to it. On Windows, for example, you can turn this on and off as described here:
https://help.ableton.com/hc/en-us/articles/209770485-Disabling-exclusive-mode-for-ASIO-interfaces
Most Python packages like pyaudio and so on just provide bindings for PortAudio, which does the heavy lifting, so also have a look at the PortAudio documentation. PortAudio "combines" all the different APIs like ASIO, ALSA, WASAPI, Core Audio, and so on.
For ALSA to create more than one stream at the same time you might need dmix (or dsnoop on the capture side); have a look at this Stack Exchange question:
https://unix.stackexchange.com/questions/355662/alsa-doesnt-work-when-multiple-applications-are-opened
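Since PortAudio does the heavy lifting, you can also inspect from Python which host API each device goes through; a quick sketch using sounddevice's query functions:

import sounddevice as sd

print(sd.query_hostapis())  # host APIs PortAudio was built with (e.g. ALSA, JACK, OSS)
print(sd.query_devices())   # every device and the host API it belongs to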

Related

Is it possible to capture audio from an ASIO device with ffmpeg?

We have a setup with a Windows 7 machine where we installed Dante Virtual Soundcard and start that soundcard with ASIO capabilities. The soundcard will receive audio over the network from a Tesira server. We want to capture the audio to files (highly preferring each channel in a separate file). The files will be played back at a later moment. There will likely be 6 channels or more.
In the same setup we use ffmpeg to capture some video, which is working fine with DirectShow. So for audio we wanted to use the same setup, since ffmpeg is able to record audio as well. However, there seems to be no option to select the ASIO devices which the virtual soundcard presumably creates. So the question is: what command line should we use for ffmpeg, or what should we install? Or which other program can record ASIO from the command line?
I already tried installing:
Asio4all (actually wrong way around)
sox (don't know why actually)
HiFi Cable Asio Bridge (from VB-audio, not enough channels even with donate version)
Voicemeeter (from VB-Audio, not enough channels and actually mixes down)
O Deus Asio Link; this might be an interesting option, but it did not let me configure any route. Any suggestions?
One thing I noticed is that the virtual soundcard can also be set to use WDM. Then I can see the devices with ffmpeg -list_devices true -f dshow -i dummy, but recording does not yield any result: I have to press Ctrl-C to make it stop instead of q, and the file is zero bytes. Supposedly this is because the data over the network is all ASIO-formatted and the Tesira server cannot send "WDM data". FFmpeg stops at selecting the capture pin for audio only.
EDIT:
I ran ffmpeg with high verbosity, and when selecting the WDM soundcard it stops at Selecting pin Capture on audio only. Also, when requesting the options it gives the same line 22 times: min ch=1 bits=8 rate= 11025 max ch=2 bits=16 rate= 44100
You might use Voicemeeter instead of HiFi-Cable / ASIO-Bridge. Voicemeeter is a virtual audio device mixer able to connect any audio point, in any interface, and any app together (including an ASIO DAW)... Download & User Manual on www.voicemeeter.com
To answer my own question: it is not possible to capture sound from an ASIO device with ffmpeg. Maybe I will write the code for it if I need it...
I could, however, solve my issues by separating the two streams of audio data we have (AVB and Dante). These were on the same switch; maybe it is a bug in the firmware, maybe a misconfiguration.
Thanks for your help!
Possible duplicate: How do I get the output from an ASIO device to IceCast2 or FFmpeg?
And if not, post the output of ffmpeg -f dshow -list_options true -i audio="your_device_name_in_dshow"
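For example, the two steps together (the device name is illustrative; use whatever -list_devices prints):

ffmpeg -list_devices true -f dshow -i dummy
ffmpeg -f dshow -list_options true -i audio="Dante Virtual Soundcard"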

How can I concatenate ATSC streams from DVB card?

I'm trying to make a simple "TV viewer" using a Linux DVB video capture card. Currently I watch TV using the following process (I'm on a Raspberry Pi):
Tune to a channel using azap -r TV_CHANNEL_HERE. This will supply bytes to device /dev/dvb/adapter0/dvr0.
Open OMXPlayer: omxplayer /dev/dvb/adapter0/dvr0
Watch TV!
The problem comes when I try to change channels. Even if I set the player to cache incoming bytes (tried with MPlayer also), the player can't withstand a channel change (done by restarting azap with a new channel).
I'm thinking this is because of changes in the MPEG TS stream metadata.
Looking for a C library that would let me do the following:
Pull cache_size * mpeg_ts_packet_size from DVR device.
Evaluate each packet and rewrite metadata (PID, etc) as needed.
Populate FIFO with resulting packet.
Set {OMXPlayer,MPlayer} to read from FIFO.
The other thing I was thinking would be to use a program that converts MPEG TS into MPEG PS and concatenate the bytes that way.
Thoughts?
Indeed, when you tune to another channel, some metadata can change and invalidate previously cached data.
Unfortunately I'm not familiar with the tools you are using, but your point 2 makes me raise an eyebrow: you will waste your time trying to rewrite Transport Stream data.
I would rather suggest stopping and restarting the process on zapping, since it seems to work fine at start.
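For instance, a channel change could be as simple as killing and relaunching the pair; a rough sketch of that restart-on-zap idea, reusing the commands from the question:

pkill azap; pkill omxplayer
azap -r NEW_CHANNEL &               # retune; azap keeps feeding dvr0
omxplayer /dev/dvb/adapter0/dvr0    # reopen the freshly tuned TS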
P.S.:
Here are some tools that can help. Also, I'm not sure at which level your problem lies, but VLC can be installed on Raspberry Pi and it handles TS gracefully.

Simulate Microphone (virtual mic)

I've got a problem where I need to "simulate" microphone output.
Data will be coming over the network, decoded into PCM, and basically needs to be written into the mic, which other programs can then read/record/whatever.
I've been reading up on ALSA but information is pretty sparse. The file plugin seems promising: I was thinking of having a named pipe as "infile" which I could then deliver data to from my application. I can't get it to work, however (vlc/audacity just segfault).
pcm.testing {
    type file
    slave {
        pcm {
            type hw
            card 0
            device 0
        }
    }
    infile "/dev/urandom"
    format "raw"
}
Are there any better ways of doing this? Any suggestions on alsa plug-ins (particularly the file plugin)?
Your sound will come over the network; what would cache it until something wants to read, or would the data be discarded?
In general, something like the below (only barely tested) should work as a virtual mic, but I think it will always read the file from the beginning when the device is opened, and you need to check how it handles end of file. Perhaps you could try using pipes, but then caching/discarding incoming data needs to be handled by the app reading from the network.
pcm.virtmic {
    type file
    format "raw"
    slave.pcm "default"
    file '/dev/null'
    infile '/dev/urandom'
}
See alsa docs for more options.
Again, not sure if this tool is what you really need for the task. It would have been really nifty if you could start a command with the 'infile' option, like you can with 'file', but unfortunately you can't...
Hope that helps.
UPDATE: slave.pcm must not be "null" but some real device; it seems it is used for timing (using null causes the recorder process to block forever). That device could force you to a given sample rate, though, so be careful; "default" is a sane value. infile needs to provide raw sound data in the correct/matching format and rate. By the way, you can look at the ALSA server, jackd, and other sound systems and libraries for alternative solutions to your task.
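To illustrate the pipe variant, here is a minimal sketch of an app that writes network data into a FIFO which the plugin's infile could point at. All names (the FIFO path, host, port) are illustrative, and the incoming bytes are assumed to already be raw PCM in the slave's format and rate:

import os
import socket

FIFO = "/tmp/virtmic.fifo"  # would become: infile '/tmp/virtmic.fifo'
if not os.path.exists(FIFO):
    os.mkfifo(FIFO)

sock = socket.create_connection(("192.0.2.1", 9000))  # example audio source

# opening the FIFO for writing blocks until a reader (the recording app) opens it
with open(FIFO, "wb") as fifo:
    while True:
        data = sock.recv(4096)
        if not data:
            break
        fifo.write(data)  # raw PCM, matching the slave's format and rate
        fifo.flush()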

Audio stream management in Linux

I have a very complicated audio setup for a project. Here's what we have:
3 applications playing sound
2 applications recording sound
2 sound cards
I don't really have the code to any of these applications. All I want to do is monitor and control the audio streams. Here are a few examples of operations I'd like to do while the applications are running:
Mute one of the incoming audio streams.
Have one of the incoming audio streams do a "solo" (be the only stream that can "talk").
Get a graph (about 30 seconds worth) of the audio that each stream produced.
Send one of the audio streams to soundcard #1, but all three audio streams to soundcard #2.
I would likely switch audio streams every 2 minutes or so using one of the operations listed above. A GUI would be preferred. I started looking at the sound systems in Linux, and it gets extremely complex; I feel like there have been many new advances in the past few years. I see JACK, PulseAudio, artsd, and several other packages. They all show some promise, but where should I start? Is there something someone already built that can help?
PulseAudio should be able to let you do all that. You'll need to configure a custom pipeline for splitting the app's audio for task 4, and I'm not exactly certain how you'd accomplish task 3, but I do know that it's capable of all sorts of audio stream handling via its volume control (pavucontrol).
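For tasks 1 and 4, per-stream control is also scriptable through pactl; a few illustrative commands (the stream index 42 and the sink names are placeholders you'd read off the list output):

pactl list short sink-inputs        # find a stream's index
pactl set-sink-input-mute 42 1      # mute stream #42 (task 1)
pactl move-sink-input 42 <sink_name>                          # send a stream to a specific card
pactl load-module module-combine-sink slaves=<sink1>,<sink2>  # duplicate audio to both cards (task 4)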
I use JACK, which is quite simple to install and use, even if it requires more effort to configure with Flash and Firefox...
You can try the latest Ubuntu Studio distribution and see if it solves your problem (for the GUI, look at "patchage").
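Under JACK, the same per-stream routing is done by connecting ports; for example (the application port name is hypothetical):

jack_lsp                                      # list all JACK ports
jack_connect system:capture_1 myapp:input_1   # wire the mic into an app by hand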

Can v4l2 be used to read audio and video from the same device?

I have a capture card that captures SDI video with embedded audio. I have source code for a Linux driver, which I am trying to enhance to add video4linux2 support. My changes are based on the vivi example.
The problem I've come up against is that all the examples I can find deal with only video or only audio. Even on the client side, everything seems to assume v4l is just video, like ffmpeg's libavdevice.
Do I need to have my driver create two separate devices, a v4l2 device and an alsa device? It seems like this makes the job of keeping audio and video in sync much more difficult.
I would prefer some way for each buffer passed between the driver and the app (through v4l2's mmap interface) contain a frame, plus some audio that matches up (with respect to time) with that frame.
Or perhaps have each buffer contain a flag indicating if it is a video frame, or a chunk of audio. Then the time stamps on the buffers could be used to sync things up.
But I don't see a way to do this with the V4L2 API spec, nor do I see any examples of v4l2-enabled apps (gstreamer, ffmpeg, transcode, etc) reading both audio and video from a single device.
Generally, the audio capture part of a device shows up as a separate device. It's usually a different physical device (possibly sharing a card), which makes sense. I'm not sure how much help that is, but it's how all of the software I'm familiar with works...
There are some spare or reserved fields in the v4l2 buffers that can be used to pass audio or other data from the driver to the calling application via pointers to mmapped buffers.
I modified the BT8x8 driver to use this approach to pass data from an A/D card synchronized to the video on Ubuntu 6.06.
It worked OK, but the effort of maintaining my modified driver caused me to abandon this approach.
If you are still interested I could dig out the details.
If you want your driver to play well with gstreamer etc., a separate audio device is generally what is expected.
Most cheap v4l2 capture cards' audio is only an analog pass-through with a volume control, requiring a jumper to capture the audio via the sound card's line input.
