Audio stream management in Linux - linux

I have a very complicated audio setup for a project. Here's what we have:
3 applications playing sound
2 applications recording sound
2 sound cards
I don't really have the code to any of these applications. All I want to do is monitor and control the audio streams. Here are a few examples of operations I'd like to do while the applications are running:
Mute one of the incoming audio streams.
Have one of the incoming audio streams do a "solo" (be the only stream that can "talk").
Get a graph (about 30 seconds worth) of the audio that each stream produced.
Send one of the audio streams to soundcard #1, but all three audio streams to soundcard #2.
I would likely switch audio streams every 2 minutes or so with one of the operations listed above. A GUI would be preferred. I started looking at the sound systems in Linux and it gets extremely complex and I feel like there have been many new advances in the past few years. I see jack, pulseaudio, artsd, and several other packages. They all have some promise but where should I start? Is there something someone already built that can help?

PulseAudio should be able to let you do all that. You'll need to configure a custom pipeline for splitting the app's audio for task 4, and I'm not exactly certain how you'd accomplish task 3, but I do know that it's capable of all sorts of audio stream handling via its volume control (pavucontrol).
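As a rough sketch of how the mute, solo, and routing operations might be scripted with PulseAudio's `pactl` command-line tool (Python, assuming the playing applications show up as sink inputs; the stream indices and sink names used here are made up for illustration, and the real ones would come from `pactl list short sink-inputs` and `pactl list short sinks`):

```python
import subprocess


def mute_cmd(stream_index, muted=True):
    """Build the pactl command that mutes or unmutes one playback stream."""
    return ["pactl", "set-sink-input-mute", str(stream_index), "1" if muted else "0"]


def solo_cmds(solo_index, all_indices):
    """Mute every stream except solo_index, and unmute that one."""
    return [mute_cmd(i, muted=(i != solo_index)) for i in all_indices]


def move_cmd(stream_index, sink_name):
    """Build the pactl command that routes one playback stream to a sink."""
    return ["pactl", "move-sink-input", str(stream_index), sink_name]


def combine_cmd(combined_name, sink_names):
    """Build the command that creates a virtual sink feeding several cards."""
    return [
        "pactl", "load-module", "module-combine-sink",
        f"sink_name={combined_name}",
        "slaves=" + ",".join(sink_names),
    ]


def run_all(commands):
    """Execute the commands; this needs a running PulseAudio server."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical indices: three playing applications as streams 5, 6 and 7.
    for cmd in solo_cmds(6, [5, 6, 7]):
        print(" ".join(cmd))
```

For task 4, one possible approach is to create a combined sink over both cards, move the one stream that should reach both cards onto it, and move the other two streams to the sink of soundcard #2 only.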

I use Jack, which is quite simple to install and use, even if it takes more effort to configure with Flash and Firefox.
You can try the latest Ubuntu Studio distribution and see if it solves your problem (for the GUI, look at "patchage").

Related

Easiest way to play mp3 files in python to a specific device

I'm converting an ESP32 project to a Raspberry Pi Zero. One of the project behaviors is to play back sound effects based on specific events or triggers. I prefer to use MP3 format so I can store information about the contents of the file in the ID3 tags to make the files themselves easier to manage. (There are a lot of them!)
I can find examples of using any number of libraries to play mp3s in python, and I found an example of selecting a device using 'sounddevice' but it seems to want numpy arrays to play sound data.
I'm wondering what the easiest and quickest way is to play mp3 files (or should I go to some other file format with a data stub file for each to do my file management?).
Since these behaviors are played as responses, they need to at least start playback quickly (i.e. not wait for a format conversion to take place). And in some cases, other behaviors (such as voice recognition triggers) are already going to add to potential latency in the device's total response time.
EDIT: additional info
"Quickest" means processor speed (Pi Zeros slow down quickly under heavy load).
These are real-time responses, so any lag from converting defeats the purpose of the playback.
Also, the device from Seeed is configured as an ALSA (asound) device.

Adding audio effects (reverb etc..) to a BackgroundAudioPlayer driven streaming audio app

I have a Windows Phone 8 app which plays audio streams from a remote location or local files using the BackgroundAudioPlayer. I now want to be able to add audio effects, for example reverb or echo.
Please could you advise me on how to do this? I haven't been able to find a way of hooking extra audio processing code into the audio processing pipeline, even though I've read much about WASAPI and XAudio2 and looked at many code examples.
Note that the app is written in C# but, from my previous experience with writing audio processing code, I know that I should be writing the audio code in native C++. Roughly speaking, I need to find a point at which there is an audio buffer containing raw PCM data which I can use as an input for my audio processing code which will then write either back to the same buffer or to another buffer which is read by the next stage of audio processing. There need to be ways of synchronizing what happens in my code with the rest of the phone's audio processing mechanisms and, of course, the process needs to be very fast so as not to cause audio glitches. Or something like that; I'm used to how VST works, not how such things might work in the Windows Phone world.
Looking forward to seeing what you suggest...
Kind regards,
Matt Daley
"I need to find a point at which there is an audio buffer containing raw PCM data"
AFAIK there's no such point. This MSDN page hints that audio/video decoding is performed not by the OS, but by the Qualcomm chip itself.
You can use something like Mp3Sharp for decoding. This way the mp3 will be decoded on the CPU by your managed code, you can interfere / process however you like, then feed the PCM into the media stream source. Main downside - battery life: the hardware-provided codecs should be much more power-efficient.
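To illustrate the "process however you like" step, here is a minimal sketch of a feedback echo applied to a buffer of decoded mono 16-bit PCM samples. It is written in Python for brevity rather than C# or C++, the delay and decay values are arbitrary assumptions, and the decoding and the hand-off to the media stream source are not shown:

```python
def add_echo(samples, sample_rate, delay_s=0.25, decay=0.5):
    """Return a copy of mono 16-bit PCM samples with a feedback echo added.

    Each repeat is the previous output delayed by delay_s and scaled by decay,
    so successive echoes get quieter.
    """
    delay = int(sample_rate * delay_s)  # delay expressed in samples
    out = list(samples)
    for i in range(delay, len(out)):
        mixed = out[i] + int(decay * out[i - delay])
        # Clip to the 16-bit range to avoid wrap-around distortion.
        out[i] = max(-32768, min(32767, mixed))
    return out
```

A real-time version would process one buffer at a time and keep the last `delay` output samples between calls, so that the echo carries across buffer boundaries.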

Sync two soundcards

I have a program written in C++ that uses RtAudio (DirectSound) to capture and play back audio at a 48 kHz sample rate.
The input capture uses a callback option. The callback writes data to a ringbuffer.
The output is a blocking write function in a separate thread that reads from the ringbuffer.
If the input and output devices are the same, the audio loops through perfectly.
Now I want to get audio from device 1 and play it back on device 2. Each device has its own sample clock set to 48 kHz, but the clocks are not in sync. After a couple of seconds the input and output are out of sync.
Is it possible to sync two independent audio devices?
There are two challenges you face:
getting the two devices to start at the same time.
getting the two devices to stay in sync.
Both of these tasks are difficult. In the pro audio world, #2 is accomplished with special hardware to sync the word-clocks of multiple devices. It can also be done with a high quality video signal. I believe it can also be done with firewire devices, but I'm not sure how that works. In practice, I have used devices with no sync ("wild") and gotten very reasonable sync for up to an hour or two. Depending on what you are trying to do, the sync should not drift more than a few milliseconds over the course of a few minutes. If it does, you can consider your hardware broken (of course, cheap hardware is often broken).
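To put rough numbers on the drift between two unsynchronized clocks, here is a small arithmetic sketch. The 20 ppm clock difference is an assumed figure for illustration only, not a measurement of any particular hardware:

```python
def drift_ms(ppm_difference, elapsed_s):
    """Time offset (milliseconds) accumulated between two free-running clocks."""
    return ppm_difference * 1e-6 * elapsed_s * 1000.0


def drift_samples(ppm_difference, elapsed_s, sample_rate=48000):
    """The same offset expressed in samples at the given sample rate."""
    return ppm_difference * 1e-6 * elapsed_s * sample_rate


if __name__ == "__main__":
    # Assumed: the two cards differ by 20 parts per million, over 3 minutes.
    print(f"{drift_ms(20, 180):.1f} ms, {drift_samples(20, 180):.1f} samples")
```

With those assumed values the offset after three minutes is about 3.6 ms, which is in line with the "few milliseconds over a few minutes" expectation above.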
As for #1, I'm not sure this is possible in any reliable sense with DirectSound. To the extent that it's possible with any audio API, it is difficult at best: both cards have streams that require some time to set up, open and start playing. In general, the solution is to use an API where this time is very low (ASIO, for example). This works reasonably well for applications like video, but I don't know if it really solves the problem in general.
If you really need to solve this problem, you could open both cards, start playing silence, and use the timing information generated by the cards to establish the delay between putting data into the card and its eventual playback (this will be different for each card and probably each time you run), then use that data to calculate when to start actual playback. I don't know if RtAudio supplies the necessary timing information, but PortAudio does. This document may help.
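The "calculate when to start actual playback" step can be sketched as follows. This assumes the output latency of each card has already been measured through the audio API's timing information; the card names and latency values are hypothetical:

```python
def start_delays(latencies_s):
    """Return how long each card should keep playing silence (seconds).

    The card with the largest measured output latency starts immediately;
    the others are held back by the difference so all are heard together.
    """
    slowest = max(latencies_s.values())
    return {name: slowest - latency for name, latency in latencies_s.items()}


def silence_frames(delay_s, sample_rate=48000):
    """Convert a hold-back time into a whole number of silent frames."""
    return round(delay_s * sample_rate)


if __name__ == "__main__":
    # Hypothetical measurements: card1 reports 10 ms, card2 reports 25 ms.
    delays = start_delays({"card1": 0.010, "card2": 0.025})
    for name, delay in delays.items():
        print(name, silence_frames(delay), "frames of extra silence")
```

This only aligns the start. It does nothing about the two clocks drifting apart afterwards.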

Audio Playback control in C++

I'm working on a project that requires me to sync audio playback (preferably of an mp3 file) with my program.
My program reads motion data from a txt file and outputs it to the serial port at a particular rate. At the same time, an audio file has to be played back on the speaker. The audio has to stay in sync with the data; that is, after transmitting, say, 100 bytes of data, the audio must have played back to a predefined time.
What tools would be used to play and control audio like this?
A tutorial would be great!
Thanks!!
In general, when working with audio, you want to synchronize other sources to audio. This is for several reasons, but most important is that audio runs on a clock running on its own hardware. You'll have to get timing information from that clock. There is a guide here written for using portaudio, but the principles apply to other situations:
http://www.portaudio.com/docs/portaudio_sync_acmc2003.pdf
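As a minimal sketch of synchronizing the serial output to the audio clock rather than the other way around: the program periodically reads the current playback time from the audio API and sends however many bytes are due by then. The function names and the byte rate are hypothetical, and reading the playback time and writing to the serial port are left out:

```python
def bytes_due(audio_time_s, bytes_per_second):
    """Total number of motion bytes that should have been sent by this audio time."""
    return int(audio_time_s * bytes_per_second)


def next_chunk(data, sent_so_far, audio_time_s, bytes_per_second):
    """Return the bytes to send now so the serial output catches up with the audio.

    Returns an empty result if the output is already at or ahead of the audio.
    """
    target = min(len(data), bytes_due(audio_time_s, bytes_per_second))
    return data[sent_so_far:target]
```

Called in a loop with the audio stream's reported time, this keeps the data output locked to the audio hardware clock, so any error stays bounded by the polling interval instead of accumulating.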

low latency sounds on key presses

I am trying to write an application (I'm a GUI first-timer) for my son, who has autism. There is a video player in the top half and a text entry area in the bottom. When letters are typed, sounds are produced to mimic the words in the video.
There have been other posts on this site in regard to playing sounds on key presses, using gstreamer as a system call. I have also tried libcanberra but both seem to have significant delays between sounds. I can write the app in python or C but will likely do at least some of it in C.
I also want to mention that the video portion is being played by gstreamer. I tried to create two instances of gstreamer, to avoid expensive system calls but the audio instance seemed to kill the app when called.
If anyone has any tips on creating faster responding sounds I would really appreciate it.
You can upload a raw audio sample directly to PulseAudio, which avoids decoding and may also save some extra switches, by using the following function from libcanberra:
http://developer.gnome.org/libcanberra/unstable/libcanberra-canberra.html#ca-context-cache
The next ca_context_play() will use it.
However, the biggest problem you'll encounter with this scenario (with simultaneous video playback) is that the audio device might be configured with large latency with PulseAudio (up to 1/2s or more for normal playback). It may be reasonable to file a bug to libcanberra to support a LOW_LATENCY flag, as it currently doesn't attempt to minimize delay for sound events afaik. That would be great to have.
GStreamer pulsesink could probably get low latency too (it has some properties for that), but I am afraid it won't be as lightweight as libcanberra, and you won't be able to cache a sample for instance. Ideally, GStreamer could also learn to cache samples, or pre-fill PulseAudio...

Resources