My machine is running Ubuntu 20 LTS. I want to manipulate the input live audio in real-time. I have achieved pitch shifting using sox. The command being -
sox -t pulseaudio default -t pulseaudio null pitch +1000
and then routing the audio from "Monitor of Nullsink" .
What I actually want to do is, silence randomized parts of the input audio, with a range. What I mean is, randomly mute 1-2s of the input audio.
The final goal of this project will be to write a script that manipulates my voice and makes it seems like my network is bad.
There is no restriction in method of achieving. That is we may use any language, make an extension, directly manipulate the input audio with sox, ffmpeg etc. Anything goes.
Found the solution by using trim in sox. The project can be found in
https://github.com/TathagataRoy1278/Bad_Internet_Audio_Modulator
I have an audio coming from a radio transceiver on my sound card's microphone input. What i want to make is a simple software-based parrot repeater using Linux CLI tools like the sox suite and arecord. For it to work, i think a flow similar to the following must take place:
The audio that comes on the microphone subdevice is getting recorded in a buffer (file or RAM-based)
When the buffer stops filling (audio stopped), start playing it's content on the audio output device (it is connected to the radio's microphone input)
When it's over, empty the buffer and start expecting step 1 to occur again
I'm looking for an elegant way to implement the logic behind step 2. Is there a CLI tool that i can use for that, so i can pipe the microphone audio taken with arecord to it and play the output of the buffer with sox?
Try looking at this. I did this on a raspberry pi a little while ago, only I made a voice changer.
https://www.instructables.com/Halloween-Voice-Changer-With-Raspberry-Pi/
Basically, play "|rec --buffer 2048 -d" takes recorded sound and puts it in a buffer that is passed in 4096 bit (byte?) chunks to play. -d stands for duration, and if left blank defaults to 0, and will run until killed. If you want to play with the options, there is some helpful info in the links.
Good luck with your project!
As much as it matters my scenario is developing an accessibility application not any kind of malicious eavesdropping, whereas also within this scenario there are various research and development implied scenarios, all of which should greatly benefit from being able to read the microphone audio stream by multiple simultaneously running unrelated processes such as recording tools and/or different versions of my own code.
Problem Statement
I am reading a microphone input stream using a high level python API like follows:
import sounddevice
audio_stream = sounddevice.InputStream(
device=self.microphone_device,
channels=max(self.channels),
samplerate=self.audio_props['sample_rate'],
blocksize=int(self.audio_props['frame_elements_size']),
callback=self.audio_callback)
I would like to learn whether it is possible (on linux) to read the microphone audio stream simultaneously to another program such as Google Meet / Zoom reading it. I.e. effectively share the audio stream.
As is with the mentioned python wrapper, it is no big surprise that when the above code is started while a video call is in progress, it will simply fail to open the stream:
Expression 'paInvalidSampleRate' failed in
'src/hostapi/alsa/pa_linux_alsa.c', line: 2043
Expression 'PaAlsaStreamComponent_InitialConfigure( &self->playback, outParams, self->primeBuffers, hwParamsPlayback, &realSr )'
failed in 'src/hostapi/alsa/pa_linux_alsa.c', line: 2716
Expression 'PaAlsaStream_Configure( stream, inputParameters, outputParameters, sampleRate, framesPerBuffer, &inputLatency, &outputLatency, &hostBufferSizeMode )'
failed in 'src/hostapi/alsa/pa_linux_alsa.c', line: 2837
Admittedly, I am not very well versed with ALSA terminology and in general the sound stack on linux yet.
My question is, can this be accomplished directly using ALSA library API, or otherwise via other sound stacks or sound system configuration? Or if all else is not meant to work, via a proxy program/driver that is able to expose an audio buffer to multiple consumers without incurring noticeable degradation in audio stream latency?
You can do this directly with ALSA. Dsnoop should do the trick. It is a plugin included with ALSA that allows sharing input streams.
From the page I linked above:
dsnoop is the equivalent of the dmix plugin, but for recording sound. The dsnoop plugin allows several applications to record from the same device simultaneously.
From the ALSA docs:
If you want to use multiple input(capture) clients you need to use the dsnoop plugin:
You can poke around there for details on how to use it. This issue on GitHub will also help you get started, it details how to configure the dsnoop interface so you can read from it with pyaudio.
Update
To configure ALSA, edit /etc/asound.conf with something like this (from the ALSA docs on dsnoop):
pcm.mixin {
type dsnoop
ipc_key 5978293 # must be unique for all dmix plugins!!!!
ipc_key_add_uid yes
slave {
pcm "hw:1,0"
channels 2
period_size 1024
buffer_size 4096
rate 44100
periods 0
period_time 0
}
bindings {
1 1
1 0
}
}
You can test to see if your configuration works with something like this:
arecord -d 30 -f cd -t wav -D pcm.mixin test.wav
So, this is more an audio question than a python question I guess. :)
Depending on the API, Streams can be device exclusive or not. ASIO for professional audio for example is often device exclusive, so just one application(like a DAW) has access to it. On Windows for example you can turn this on and off as seen here:
https://help.ableton.com/hc/en-us/articles/209770485-Disabling-exclusive-mode-for-ASIO-interfaces
Most Python packages like pyaudio and so on are just providing bindings for portaudio, which does the heavy lifting, so also have a look at the portaudio documentation. Portaudio "combines" all the different APIs like ASIO,ALSA,WASAPI,Core Audio, and so on.
For ALSA to create more than one Stream at the same time you might need dmix, have a look at this Stackoverflow question:
https://unix.stackexchange.com/questions/355662/alsa-doesnt-work-when-multiple-applications-are-opened
I have phone call recordings which are dual channel with each channel supposed to carry only the voice of one speaker. However, they have some echo of the other channel. Any ways to remove this, in ffmpeg or sox or otherwise.
I am working on a Ubuntu 16.04 environment and using mplayer to play back the audio. A link to a 10s clip of the audio may be found here: https://drive.google.com/file/d/14xrchHvcluhDNGutYfCPpQi3cas_4Ogi/view?usp=sharing
I also looked at (almost) the same question: Silence out quiet periods in audio file with ffmpeg
Not very sure I could follow the answer/comment though.
Thanks!
I'm trying to make a simple "TV viewer" using a Linux DVB video capture card. Currently I watch TV using the following process (I'm on a Raspberry Pi):
Tune to a channel using azap -r TV_CHANNEL_HERE. This will supply bytes to
device /dev/dvb/adapter0/dvr0.
Open OMXPlayer omxplayer /dev/dvb/adapter0/dvr0
Watch TV!
The problem comes when I try to change channels. Even if I set the player to cache incoming bytes (tried with MPlayer also), the player can't withstand a channel change (by restarting azap with a new channel.
I'm thinking this is because of changes in the MPEG TS stream metadata.
Looking for a C library that would let me do the following:
Pull cache_size * mpeg_ts_packet_size from DVR device.
Evaluate each packet and rewrite metadata (PID, etc) as needed.
Populate FIFO with resulting packet.
Set {OMXPlayer,MPlayer} to read from FIFO.
The other thing I was thinking would be to use a program that converts MPEG TS into MPEG PS and concatenate the bytes that way.
Thoughts?
Indeed, when you want to tune on an other channel, some metadata can potentially change and invalid previously cached data.
Unfortunately I'm not familiar with the tools you are using but your point 2. makes me raise an eyebrow: you will waste your time trying to rewrite Transport Stream data.
I would rather suggest to stop and restart process on zapping since it seems to work fine at start.
P.S.:
Here are some tools that can help. Also, I'm not sure at which level your problem is but VLC can be installed on Raspberry PI and it handles TS gracefully.