SDL2 won't play with more than 6 audio channels

I am trying to stream raw video and audio from a capture device as part of my home media setup, with my PC acting much like the receiver in a typical home-theatre setup. The biggest problem I haven't been able to get past is that ffplay (using SDL2 as its audio backend) won't play all 8 channels of a 7.1 stream: two simply get dropped, despite ffplay recognising the 8-channel input and despite me explicitly specifying a 7.1 layout.
I have confirmed that all 8 channels are present in the source by using ffmpeg to save the output of a speaker test to a file, then playing that file back with both mplayer (which works) and ffplay (which doesn't). I also wrote some minimal code to play the audio directly through SDL's API, with the same result, so it's not the fault of ffplay. I might simply use mplayer if it weren't for the fact that piping output from ffmpeg adds too much latency for real-time use. I am using libSDL 2.0.12 and ffplay 4.2.3, both of which are the latest at the time of writing and are ostensibly supposed to support 7.1 audio.
Using output recorded from speaker-test -c 8, I use the following to play it back in mplayer:
mplayer -channels 8 -rawaudio channels=8 -format s16le -demuxer rawaudio speaker-test.pcm
and the following to play it back in ffplay:
ffplay -f s16le -ac 8 -af 'channelmap=channel_layout=7.1' speaker-test.pcm
No matter what I try, the two side channels get dropped. I couldn't figure out how to play raw PCM through SDL directly, so I repeated the same tests with WAV output and used the following code to play it back:
#include <SDL2/SDL.h>

int main(int argc, char **argv) {
    SDL_Init(SDL_INIT_AUDIO);

    SDL_AudioSpec wavSpec;
    Uint32 wavLength;
    Uint8 *wavBuffer;
    SDL_LoadWAV("speaker-test.wav", &wavSpec, &wavBuffer, &wavLength);

    /* NULL for the obtained spec: SDL converts to the device's
       native format behind the scenes if it differs from wavSpec. */
    SDL_AudioDeviceID deviceID = SDL_OpenAudioDevice(NULL, 0, &wavSpec, NULL, 0);
    SDL_QueueAudio(deviceID, wavBuffer, wavLength);
    SDL_PauseAudioDevice(deviceID, 0);

    SDL_Delay(30000);               /* let the queued audio play out */

    SDL_CloseAudioDevice(deviceID);
    SDL_FreeWAV(wavBuffer);
    SDL_Quit();
    return 0;
}
The above code exhibits the same behaviour of dropping the two side channels, despite being built against the latest version of SDL, which has ostensibly supported 7.1 for many releases now. Why might this be happening, and how might I fix it?
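One way to narrow down where the channels disappear is to request the 8-channel spec while passing SDL_AUDIO_ALLOW_CHANNELS_CHANGE, so that SDL_OpenAudioDevice reports the channel count the device actually opened with instead of silently converting to it. A minimal diagnostic sketch (the 48 kHz rate and the default device are assumptions):

#include <SDL2/SDL.h>
#include <stdio.h>

int main(void) {
    SDL_Init(SDL_INIT_AUDIO);

    SDL_AudioSpec want, have;
    SDL_zero(want);
    want.freq = 48000;
    want.format = AUDIO_S16SYS;
    want.channels = 8;                  /* request full 7.1 */
    want.samples = 4096;

    /* With SDL_AUDIO_ALLOW_CHANNELS_CHANGE, SDL hands back whatever
       channel count the device really opened with, rather than
       converting behind our backs. */
    SDL_AudioDeviceID dev = SDL_OpenAudioDevice(
        NULL, 0, &want, &have, SDL_AUDIO_ALLOW_CHANNELS_CHANGE);
    if (dev == 0) {
        fprintf(stderr, "open failed: %s\n", SDL_GetError());
    } else {
        printf("requested %d channels, device opened with %d\n",
               want.channels, have.channels);
        SDL_CloseAudioDevice(dev);
    }
    SDL_Quit();
    return 0;
}

If this prints 6, the two channels are being lost in the device (or backend) negotiation rather than in SDL's converter.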

Related

FFMPEG: Properly sidechain_compress stereo background with stereo sidechain into stereo output

I'm doing voiceover work, and since Sony Vegas does not support sidechaining, I render the voiceover into voices.wav and then apply the sidechaincompress filter, as per the ffmpeg documentation:
ffmpeg -y -i background.m4a -i voices.wav -filter_complex \
"[1:a]asplit=2[sc][mix];\
[0:a][sc]sidechaincompress=threshold=0.015:ratio=2:level_sc=0.8:release=500:attack=1[compr];\
[compr][mix]amerge" sidechain_1.wav
voices.wav is a stereo audio file, as is background.m4a. But here's what the result file looks like when loaded into Sony Vegas:
This shows that in channels 1/2 I get the compressed background, while channels 3 and 4 hold two mono tracks that somehow differ (probably the original voices input and a somewhat altered voices input, both in mono). Update: I don't want to further process the resulting tracks in Sony Vegas; I'd prefer ffmpeg to be the last step in my production process. The screenshot above is for illustration purposes only.
1. Does the background get sidechain-compressed by only the left or only the right channel of the voices? If so, how do I change that so it is compressed by both channels (some voices are panned left or right, so there may be an actual difference in the compressed result)?
2. What are those channels 3 and 4? Why are they mono?
3. How do I get a single stereo 1/2 track in the output WAV file instead of these weird 4 channels across 3 tracks? (I've looked at the pan complex filter, but couldn't figure out how to set it up in my case.)
amerge concatenates the channels of its inputs, while amix mixes them together, using the channel count of the input with the most channels. So, switch to amix:
ffmpeg -y -i background.m4a -i voices.wav -filter_complex \
"[1:a]asplit=2[sc][mix];\
[0:a][sc]sidechaincompress=threshold=0.015:ratio=2:level_sc=0.8:release=500:attack=1[compr];\
[compr][mix]amix" sidechain_1.wav

Removal of low-noise echo from stereo phone call recordings

I have phone call recordings which are dual-channel, with each channel supposed to carry only the voice of one speaker. However, each channel has some echo of the other. Are there any ways to remove this, with ffmpeg, SoX, or otherwise?
I am working in an Ubuntu 16.04 environment and using mplayer to play back the audio. A link to a 10 s clip of the audio may be found here: https://drive.google.com/file/d/14xrchHvcluhDNGutYfCPpQi3cas_4Ogi/view?usp=sharing
I also looked at an (almost) identical question: Silence out quiet periods in audio file with ffmpeg. I'm not sure I could follow the answer and comments there, though.
Thanks!
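A naive thing to try (a sketch, not a proper echo canceller): subtract an attenuated copy of the opposite channel with ffmpeg's pan filter. This only helps if the bleed really is a near-instantaneous scaled copy of the other channel; the 0.2 gain is a guess to be tuned by ear:
ffmpeg -i call.wav -af "pan=stereo|c0=c0-0.2*c1|c1=c1-0.2*c0" cleaned.wav
If the echo is delayed rather than immediate, this will not cancel it, and a real echo canceller (such as the one in SpeexDSP) is a better fit.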

mkv file out of sync with linear drift

I have a bunch of mkv files, with FLAC as the audio codec and FFV1 as the video one.
The files were created with an EasyCap acquisition dongle from a VCR analog source. Specifically, I used VLC's "open acquisition device" prompt and selected PAL. Then I converted the files (audio PCM, video raw YUV) to (FLAC, FFV1) using
ffmpeg.exe -i input.avi -acodec flac -vcodec ffv1 -level 3 -threads 4 -coder 1 -context 1 -g 1 -slices 24 -slicecrc 1 output.mkv
Now, the files are progressively out of sync. It may be due to the fact that, while the video (maybe) has a constant frame rate, the FLAC track has a variable one. So, is there a way to sync the video track to the audio, or something similar? Can FFmpeg do this? Thanks
EDIT
Following Mulvya's hint, I plotted the difference in sync at various times; the first column shows the seconds elapsed, the second the difference in seconds. The plot behaves linearly, with a constant slope of 0.0078. NOTE: measurements were taken by hand, with a stopwatch.
EDIT 2
Playing around with VirtualDub, I found that changing the frame rate from the original 24.889 fps to 25 fps (Video -> Frame rate... -> Change frame rate to) and using the track converted to WAV definitely works. Two problems, though: VirtualDub crashes when importing the original FFV1/FLAC mkv file, so I had to convert the video to H264 to try it out; moreover, I find it difficult to use an external encoder to save VirtualDub's output.
So, could I avoid using VirtualDub, and simply use ffmpeg for it? Here's the exported vdscript:
VirtualDub.audio.SetSource("E:\\4_track2.wav", "");
VirtualDub.audio.SetMode(0);
VirtualDub.audio.SetInterleave(1,500,1,0,0);
VirtualDub.audio.SetClipMode(1,1);
VirtualDub.audio.SetEditMode(1);
VirtualDub.audio.SetConversion(0,0,0,0,0);
VirtualDub.audio.SetVolume();
VirtualDub.audio.SetCompression();
VirtualDub.audio.EnableFilterGraph(0);
VirtualDub.video.SetInputFormat(0);
VirtualDub.video.SetOutputFormat(7);
VirtualDub.video.SetMode(3);
VirtualDub.video.SetSmartRendering(0);
VirtualDub.video.SetPreserveEmptyFrames(0);
VirtualDub.video.SetFrameRate2(25,1,1);
VirtualDub.video.SetIVTC(0, 0, 0, 0);
VirtualDub.video.SetCompression();
VirtualDub.video.filters.Clear();
VirtualDub.audio.filters.Clear();
The first line imports the WAV-converted audio track.
Can I set up an equivalent pipeline in ffmpeg (possibly using FLAC, not WAV)? SetFrameRate2 is maybe the key here.
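A sketch of the same fix done purely in ffmpeg, assuming the drift really is the constant 24.889 -> 25 ratio found in VirtualDub (24.889/25 ≈ 0.99556): read the file twice, rescale only the video input's timestamps with -itsscale, and stream-copy both tracks so neither FFV1 nor FLAC is re-encoded:
ffmpeg -itsscale 0.99556 -i output.mkv -i output.mkv \
  -map 0:v -map 1:a -c copy resynced.mkv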

Is it possible to capture audio from an ASIO device with ffmpeg?

We have a setup with a Windows 7 machine where we installed Dante Virtual Soundcard and start that soundcard with ASIO capabilities. The soundcard receives audio over the network from a Tesira server. We want to capture the audio to files (strongly preferring each channel in a separate file). The files will be played back at a later moment. There will likely be 6 channels or more.
In the same setup we use ffmpeg to capture some video via DirectShow, which is working fine. So for audio we wanted to use the same setup, since ffmpeg is able to record audio as well. However, there seems to be no option to select the ASIO devices which the virtual soundcard presumably creates. So the question is: what command line should we use for ffmpeg, or what do we need to install? Or which other program can record ASIO from the command line?
I already tried installing:
Asio4all (actually the wrong way around)
sox (don't know why, actually)
HiFi Cable ASIO Bridge (from VB-Audio; not enough channels, even with the donate version)
Voicemeeter (from VB-Audio; not enough channels, and it actually mixes down)
O Deus ASIO Link, which might be an interesting option, but it did not let me configure any routing; any suggestions?
One thing I noticed is that the virtual soundcard can also be set to use WDM. Then I can see the devices with ffmpeg -list_devices true -f dshow -i dummy, but recording does not yield any result: I have to press Ctrl-C to make it stop instead of q, and the file is zero bytes. Supposedly this is because the data over the network is all ASIO-formatted and the Tesira server cannot send "WDM data". FFmpeg stops at selecting the capture pin for audio only.
EDIT:
I ran ffmpeg with high verbosity, and when selecting the WDM soundcard it stops at "Selecting pin Capture on audio only". Also, when requesting the options it prints the same line 22 times: min ch=1 bits=8 rate= 11025 max ch=2 bits=16 rate= 44100
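For what it's worth, dshow lets you pin the capture format explicitly, which sometimes gets past a stuck pin negotiation. A sketch, with the device name a placeholder for whatever -list_devices reported:
ffmpeg -f dshow -sample_rate 44100 -sample_size 16 -channels 2 \
  -i audio="Dante Virtual Soundcard (WDM)" capture.wav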
You might use Voicemeeter instead of HiFi Cable / ASIO Bridge. Voicemeeter is a virtual audio device mixer able to connect everything together: any audio point, in any interface, and any app (including ASIO DAWs). Download and user manual at www.voicemeeter.com
To answer my own question: it is not possible to capture sound from an ASIO device with ffmpeg. Maybe I will write the code for it if I ever need it...
I could, however, solve my issue by separating the two streams of audio data we have (AVB and Dante). These were on the same switch; maybe it is a bug in the firmware, maybe a misconfiguration.
Thanks for your help!
Possible duplicate: How do I get the output from an ASIO device to IceCast2 or FFMpeg?
And if not, post the output of ffmpeg -f dshow -list_options true -i "audio=your_device_name_in_dshow".

Correct way to encode Kinect audio with lame.exe

I receive data from a Kinect v2, which is (I believe; information is hard to find) 16 kHz mono audio in 32-bit floating-point PCM. The data arrives in up to 4 "SubFrames", which contain 256 samples each.
When I send this data to lame.exe with -r -s 16 --bitwidth 32 -m m, I get output containing gaps (supposedly where the second channel should be). These command-line switches should, however, take stereo input and downmix it to mono.
I've also tried importing the raw data into Audacity, but I still can't figure out the correct way to get continuous audio out of it.
EDIT: I can get continuous audio when I only save the first SubFrame. The audio still doesn't sound right though.
In the end I went with Ogg Vorbis: a free format, so no problems there either. I use the following command-line switches for oggenc2.exe:
oggenc2.exe --raw-format=3 --raw-chan=1 --raw-rate=16000 - --output=[filename]
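(--raw-format=3 declares 32-bit float input, --raw-chan=1 mono, and --raw-rate=16000 the sample rate, matching the Kinect stream described above.) If you'd rather stay with lame, one option is to convert the float samples to 16-bit signed PCM first, since lame's raw reader appears to expect integer PCM. A minimal sketch; the 256-sample SubFrame size is from the post, everything else is an assumption:

/* f32_to_s16.c: convert raw 32-bit float PCM on stdin to 16-bit
   signed PCM on stdout, e.g.:
     f32_to_s16 < kinect.f32 | lame -r -s 16 -m m - out.mp3 */
#include <stdio.h>
#include <stdint.h>
#ifdef _WIN32
#include <io.h>
#include <fcntl.h>
#endif

int main(void) {
#ifdef _WIN32
    _setmode(_fileno(stdin), _O_BINARY);   /* raw bytes, no CRLF mangling */
    _setmode(_fileno(stdout), _O_BINARY);
#endif
    float in[256];                         /* one SubFrame per iteration */
    int16_t out[256];
    size_t n;
    while ((n = fread(in, sizeof(float), 256, stdin)) > 0) {
        for (size_t i = 0; i < n; i++) {
            float s = in[i];
            if (s > 1.0f)  s = 1.0f;       /* clamp to the valid range */
            if (s < -1.0f) s = -1.0f;
            out[i] = (int16_t)(s * 32767.0f);
        }
        fwrite(out, sizeof(int16_t), n, stdout);
    }
    return 0;
}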
