Format settings for iOS multimixer au when using a bluetooth endpoint - audio

Hi Core Audio/Au community,
I have hit a roadbloc during development. My current AUGraph is set up as 2 Mono streams->Mixer unit->remoteIO unit on an iOS platform. I am using the mixer to mix two mono stream into stereo interleaved. However, the need is that mono streams neednt be mixed at all times while being played out in stereo i.e the interleaved stereo output should be composed of: the 1st mono stream in the left ear and the 2nd mono stream in the right ear. I am able to accomplish this using the kAudioUnitProperty_MatrixLevels property on the Multichannel mixer.
//Left out //right out
matrixVolumes[0][0]=1; matrixVolumes[0][1]=0.001;
matrixVolumes[1][0]=0.001; matrixVolumes[1][1]=0.001;
result = AudioUnitSetProperty(mAumixer, kAudioUnitProperty_MatrixLevels, kAudioUnitScope_Input, 0,matrixVolumes , matrixPropSize);
if (result) {
printf("Error while setting kAudioUnitProperty_MatrixLevels from mixer on bus 0 %ld %08X %4.4s\n", result, (unsigned int)result, (char*)&result);
return -1;
}
printf("setting matrix levels kAudioUnitProperty_MatrixLevels on bus 1 \n");
//Left out //right out
matrixVolumes[0][0]=0.001; matrixVolumes[0][1]=1;
matrixVolumes[1][0]=0.001; matrixVolumes[1][1]=0.001;
result = AudioUnitSetProperty(mAumixer, kAudioUnitProperty_MatrixLevels, kAudioUnitScope_Input, 1,matrixVolumes , matrixPropSize);
if (result) {
printf("Error while setting kAudioUnitProperty_MatrixLevels from mixer on bus 1 %ld %08X %4.4s\n", result, (unsigned int)result, (char*)&result);
return -1;
}
As shown above I am using the volume controls to control the streams playing as unmixed stereo interleaved separately. This works fine when I am using a wired headset to play the audio; the 1st mono stream plays on the left ear and the second mono stream plays on the right ear. But when I am switching to a bluetooth headset the audio output is a mix of both the mono streams playing on both the left and right channel. So, the matrix levels do not seem to work there. The formats used for the i/p and o/p of the mixer are as follows:
printf("create Input ASBD\n");
// client format audio goes into the mixer
obj->mInputFormat.mFormatID = kAudioFormatLinearPCM;
int sampleSize = ((UInt32)sizeof(AudioSampleType));
obj->mInputFormat.mFormatFlags = kAudioFormatFlagsCanonical;
obj->mInputFormat.mBitsPerChannel = 8 * sampleSize;
obj->mInputFormat.mChannelsPerFrame = 1; //mono
obj->mInputFormat.mFramesPerPacket = 1;
obj->mInputFormat.mBytesPerPacket = obj->mInputFormat.mBytesPerFrame = sampleSize;
obj->mInputFormat.mFormatFlags |= kAudioFormatFlagIsNonInterleaved;
// obj->mInputFormat.mSampleRate = obj->mGlobalSampleRate;(// set later while initializing audioStreamer or from the client app)
printf("create output ASBD\n");
// output format for the mixer unit output bus
obj->mOutputFormat.mFormatID = kAudioFormatLinearPCM;
obj->mOutputFormat.mFormatFlags = kAudioFormatFlagsCanonical | (kAudioUnitSampleFractionBits << kLinearPCMFormatFlagsSampleFractionShift);
obj->mOutputFormat.mChannelsPerFrame = 2;//stereo
obj->mOutputFormat.mFramesPerPacket = 1;
obj->mOutputFormat.mBitsPerChannel = 8 * ((UInt32)sizeof(AudioUnitSampleType));
obj->mOutputFormat.mBytesPerPacket = obj->mOutputFormat.mBytesPerFrame = 2 * ((UInt32)sizeof(AudioUnitSampleType));
// obj->mOutputFormat.mSampleRate = obj->mGlobalSampleRate; (// set later while initializing)
N.B : I am setting the sample rates separately from the application.The i/p and o/p sample rate of the mixer is same as sample rate of the mono audio files.
Thanks in advance for taking a look at the issue...:)

Related

How do I use FFMPEG/libav to access the data in individual audio samples?

The end result is I'm trying to visualise the audio waveform to use in a DAW-like software. So I want to get each sample's value and draw it. With that in mind, I'm currently stumped by trying to gain access to the values stored in each sample. For the time being, I'm just trying to access the value in the first sample - I'll build it into a loop once I have some working code.
I started off by following the code in this example. However, LibAV/FFMPEG has been updated since then, so a lot of the code is deprecated or straight up doesn't work the same anymore.
Based on the example above, I believe the logic is as follows:
get the formatting info of the audio file
get audio stream info from the format
check that the codec required for the stream is an audio codec
get the codec context (I think this is info about the codec) - This is where it gets kinda confusing for me
create an empty packet and frame to use - packets are for holding compressed data and frames are for holding uncompressed data
the format reads the first frame from the audio file into our packet
pass that packet into the codec context to be decoded
pass our frame to the codec context to receive the uncompressed audio data of the first frame
create a buffer to hold the values and try allocating samples to it from our frame
From debugging my code, I can see that step 7 succeeds and the packet that was empty receives some data. In step 8, the frame doesn't receive any data. This is what I need help with. I get that if I get the frame, assuming a stereo audio file, I should have two samples per frame, so really I just need your help to get uncompressed data into the frame.
I've scoured through the documentation for loads of different classes and I'm pretty sure I'm using the right classes and functions to achieve my goal, but evidently not (I'm also using Qt, so I'm using qDebug throughout, and QString to hold the URL for the audio file as path). So without further ado, here's my code:
// Step 1 - get the formatting info of the audio file
AVFormatContext* format = avformat_alloc_context();
if (avformat_open_input(&format, path.toStdString().c_str(), NULL, NULL) != 0) {
qDebug() << "Could not open file " << path;
return -1;
}
// Step 2 - get audio stream info from the format
if (avformat_find_stream_info(format, NULL) < 0) {
qDebug() << "Could not retrieve stream info from file " << path;
return -1;
}
// Step 3 - check that the codec required for the stream is an audio codec
int stream_index =- 1;
for (unsigned int i=0; i<format->nb_streams; i++) {
if (format->streams[i]->codecpar->codec_type == AVMEDIA_TYPE_AUDIO) {
stream_index = i;
break;
}
}
if (stream_index == -1) {
qDebug() << "Could not retrieve audio stream from file " << path;
return -1;
}
// Step 4 -get the codec context
const AVCodec *codec = avcodec_find_decoder(format->streams[stream_index]->codecpar->codec_id);
AVCodecContext *codecContext = avcodec_alloc_context3(codec);
avcodec_open2(codecContext, codec, NULL);
// Step 5 - create an empty packet and frame to use
AVPacket *packet = av_packet_alloc();
AVFrame *frame = av_frame_alloc();
// Step 6 - the format reads the first frame from the audio file into our packet
av_read_frame(format, packet);
// Step 7 - pass that packet into the codec context to be decoded
avcodec_send_packet(codecContext, packet);
//Step 8 - pass our frame to the codec context to receive the uncompressed audio data of the first frame
avcodec_receive_frame(codecContext, frame);
// Step 9 - create a buffer to hold the values and try allocating samples to it from our frame
double *buffer;
av_samples_alloc((uint8_t**) &buffer, NULL, 1, frame->nb_samples, AV_SAMPLE_FMT_DBL, 0);
qDebug () << "packet: " << &packet;
qDebug() << "frame: " << frame;
qDebug () << "buffer: " << buffer;
For the time being, step 9 is incomplete as you can probably tell. But for now, I need help with step 8. Am I missing a step, using the wrong function, wrong class? Cheers.

How to lower the quality and specs of a wav file on linux

So to preface my problem, I'll give some context.
In SDL2 you can load wav files such as from the wiki:
SDL_AudioSpec wav_spec;
Uint32 wav_length;
Uint8 *wav_buffer;
/* Load the WAV */
if (SDL_LoadWAV("test.wav", &wav_spec, &wav_buffer, &wav_length) == NULL) {
fprintf(stderr, "Could not open test.wav: %s\n", SDL_GetError());
} else {
/* Do stuff with the WAV data, and then... */
SDL_FreeWAV(wav_buffer);
}
The issue I'm getting from SDL_GetError is Complex WAVE files not supported
Now the wav file I'm intending to open has the following properties:
Playing test.wav.
Detected file format: WAV / WAVE (Waveform Audio) (libavformat)
ID_AUDIO_ID=0
[lavf] stream 0: audio (pcm_s24le), -aid 0
Clip info:
encoded_by: Pro Tools
ID_CLIP_INFO_NAME0=encoded_by
ID_CLIP_INFO_VALUE0=Pro Tools
originator_reference:
ID_CLIP_INFO_NAME1=originator_reference
ID_CLIP_INFO_VALUE1=
date: 2016-05-1
ID_CLIP_INFO_NAME2=date
ID_CLIP_INFO_VALUE2=2016-05-1
creation_time: 20:13:34
ID_CLIP_INFO_NAME3=creation_time
ID_CLIP_INFO_VALUE3=20:13:34
time_reference:
ID_CLIP_INFO_NAME4=time_reference
ID_CLIP_INFO_VALUE4=
ID_CLIP_INFO_N=5
Load subtitles in dir/
ID_FILENAME=dir/test.wav
ID_DEMUXER=lavfpref
ID_AUDIO_FORMAT=1
ID_AUDIO_BITRATE=2304000
ID_AUDIO_RATE=48000
ID_AUDIO_NCH=2
ID_START_TIME=0.00
ID_LENGTH=135.53
ID_SEEKABLE=1
ID_CHAPTERS=0
Selected audio codec: Uncompressed PCM [pcm]
AUDIO: 48000 Hz, 2 ch, s24le, 2304.0 kbit/100.00% (ratio: 288000->288000)
ID_AUDIO_BITRATE=2304000
ID_AUDIO_RATE=48000
ID_AUDIO_NCH=2
AO: [pulse] 48000Hz 2ch s16le (2 bytes per sample)
ID_AUDIO_CODEC=pcm
From the wiki.libsdl.org/SDL_OpenAudioDevice page and subsequent wiki.libsdl.org/SDL_AudioSpec#Remarks page I can at least surmise that a wav file of:
freq = 48000;
format = AUDIO_F32;
channels = 2;
samples = 4096;
quality should work.
The main problem I can see is that my wav file has the s16le format whereas it's not listed on the SDL_AudioSpec page.
This leads me to believe I need to reduce the quality of test.wav so it does not appear as "complex" in SDL.
When I search engine Complex WAVE files not supported nothing helpful comes up, except it appears in the SDL_Mixer library, which as far as I know I'm not using.
Can the format be changed via ffmepg to work in SDL2?
Edit: This appears to be the actual code in SDL2 where it complains. I don't really know enough about C to dig all the way through the vast SDL2 library, but I thought it might help if someone notices something just from hinting variable names and such:
/* Read the audio data format chunk */
chunk.data = NULL;
do {
if ( chunk.data != NULL ) {
SDL_free(chunk.data);
chunk.data = NULL;
}
lenread = ReadChunk(src, &chunk);
if ( lenread < 0 ) {
was_error = 1;
goto done;
}
/* 2 Uint32's for chunk header+len, plus the lenread */
headerDiff += lenread + 2 * sizeof(Uint32);
} while ( (chunk.magic == FACT) || (chunk.magic == LIST) );
/* Decode the audio data format */
format = (WaveFMT *)chunk.data;
if ( chunk.magic != FMT ) {
SDL_SetError("Complex WAVE files not supported");
was_error = 1;
goto done;
}
After a couple hours of fun audio converting I got it working, will have to tweak it to try and get better sound quality.
To answer the question at hand, converting can be done by:
ffmpeg -i old.wav -acodec pcm_s16le -ac 1 -ar 16000 new.wav
To find codecs on your version of ffmpeg:
ffmpeg -codecs
This format works with SDL.
Next within SDL when setting the desired SDL_AudioSpec make sure to have the correct settings:
freq = 16000;
format = AUDIO_S16LSB;
channels = 2;
samples = 4096;
Finally the main issue was most likely using the legacy SDL_MixAudio instead of the newer SDL_MixAudioFormat
With the following settings:
SDL_MixAudioFormat(stream, mixData, AUDIO_S16LSB, len, SDL_MIX_MAXVOLUME / 2); as can be found on the wiki.

AAC stream resampled incorrectly

I do have a very particular problem, I wish I could find the answer to.
I'm trying to read an AAC stream from an URL (online streaming radio e.g. live.noroc.tv:8000/radionoroc.aacp) with NAudio library and get IEEE 32 bit floating samples.
Besides that I would like to resample the stream to a particular sample rate and channel count (rate 5512, mono).
Below is the code which accomplishes that:
int tenSecondsOfDownloadedAudio = 5512 * 10;
float[] buffer = new float[tenSecondsOfDownloadedAudio];
using (var reader = new MediaFoundationReader(pathToUrl))
{
var ieeeFloatWaveFormat = WaveFormat.CreateIeeeFloatWaveFormat(5512, 1); // mono
using (var resampler = new MediaFoundationResampler(reader, ieeeFloatWaveFormat))
{
var waveToSampleProvider = new WaveToSampleProvider(resampler);
int readSamples = 0;
int tempBuffer = new float[5512]; // 1 second buffer
while(readSamples <= tenSecondsOfDownloadedAudio)
{
int read = waveToSampleProvider.Read(tempBuffer, 0, tempBuffer.Length);
if(read == 0)
{
Thread.Sleep(500); // allow streaming buffer to get loaded
continue;
}
Array.Copy(tempBuffer, 0, buffer, readSamples, tempBuffer.Length);
readSamples += read;
}
}
}
These particular samples are then written to a Wave audio file using the following simple method:
using (var writer = new WaveFileWriter("path-to-audio-file.wav", WaveFormat.CreateIeeeFloatWaveFormat(5512, 1)))
{
writer.WriteSamples(samples, 0, samples.Length);
}
What I've encountered is that NAudio does not read 10 seconds of audio (as it was requested) but only 5, though the buffer array gets fully loaded with samples (which at this rate and channel count should contain 10 seconds of audio samples).
Thus the final audio file plays the stream 2 times as slower as it should (5 second stream is played as 10).
Is this somewhat related to different bit depths (should I record at 64 bits per sample as opposite to 32).
I do my testing at Windows Server 2008 R2 x64, with MFT codecs installed.
Would really appreciate any suggestions.
The problem seems to be with MediaFoundationReader failing to handle HE-AACv2 in ADTS container with is a standard online radio stream format and most likely the one you are dealing with.
Adobe products have the same problem mistreating this format exactly the same way^ stretching the first half of the audio to the whole duration and : Corrupted AAC files recorded from online stream
Supposedly, it has something to do with HE-AACv2 stereo stream being actually a mono stream with additional info channel for Parametric Stereo.

Generate audio tone to sound card in C++ or C#

I am trying to generate a tone to the sound card (Frequency: 1950 hz, duration: 40 ms, level: -30 db, right-channel, on steam 1). Any recommendations on how to accomplish this using C++ or C#. Are there any libraries (C++ or C#) for generating such precise tone?
David, playing audio to the speakers was built right into .NET (i think in the .NET 2.0 Framework). Using the System.Media.SoundPlayer you can play a sound from a memory stream that you build (in WAV format). Here is a function i coded that plays a simple frequency for a certain duration. Regarding the decibels and sending it to the sound card, i don't really understand what specifics you are referring to. For instance i fail to understand how audio as measured in decibels is sent to the sound card. My understanding is that decibels are simply a measure of how loud a sound is, thus after it's been reproduced by the speakers. Thus the volume control on the speakers affects what decibel your sounds will produce, and sending a certain decibel to the sound card thus makes no sense to me. Maybe you need something more detailed and maybe this doesn't work for you. But maybe you can run with this and get it to work for what you need. And maybe it is almost exactly what you are asking.
The process i use in this code allows one to build any audio you want, and plays it. So you can create 2 sine waves or many, many more, or triangle waves, or even speech synthesis with this method if you want. This method takes sound samples which are calculated and then plays those, so you need to code what each audio sample needs to be at the given moment in time. WAV allows stereo sound too, but this code sample only uses non-stereo sound. If you want stereo sound then it just needs modified to generate the bytes for a stereo WAV format instead. I expect it would not be too difficult.
Happy coding!
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Windows.Forms;
public static void PlayBeep(UInt16 frequency, int msDuration, UInt16 volume = 16383)
{
var mStrm = new MemoryStream();
BinaryWriter writer = new BinaryWriter(mStrm);
const double TAU = 2 * Math.PI;
int formatChunkSize = 16;
int headerSize = 8;
short formatType = 1;
short tracks = 1;
int samplesPerSecond = 44100;
short bitsPerSample = 16;
short frameSize = (short)(tracks * ((bitsPerSample + 7) / 8));
int bytesPerSecond = samplesPerSecond * frameSize;
int waveSize = 4;
int samples = (int)((decimal)samplesPerSecond * msDuration / 1000);
int dataChunkSize = samples * frameSize;
int fileSize = waveSize + headerSize + formatChunkSize + headerSize + dataChunkSize;
// var encoding = new System.Text.UTF8Encoding();
writer.Write(0x46464952); // = encoding.GetBytes("RIFF")
writer.Write(fileSize);
writer.Write(0x45564157); // = encoding.GetBytes("WAVE")
writer.Write(0x20746D66); // = encoding.GetBytes("fmt ")
writer.Write(formatChunkSize);
writer.Write(formatType);
writer.Write(tracks);
writer.Write(samplesPerSecond);
writer.Write(bytesPerSecond);
writer.Write(frameSize);
writer.Write(bitsPerSample);
writer.Write(0x61746164); // = encoding.GetBytes("data")
writer.Write(dataChunkSize);
{
double theta = frequency * TAU / (double)samplesPerSecond;
// 'volume' is UInt16 with range 0 thru Uint16.MaxValue ( = 65 535)
// we need 'amp' to have the range of 0 thru Int16.MaxValue ( = 32 767)
double amp = volume >> 2; // so we simply set amp = volume / 2
for (int step = 0; step < samples; step++)
{
short s = (short)(amp * Math.Sin(theta * (double)step));
writer.Write(s);
}
}
mStrm.Seek(0, SeekOrigin.Begin);
new System.Media.SoundPlayer(mStrm).Play();
writer.Close();
mStrm.Close();
} // public static void PlayBeep(UInt16 frequency, int msDuration, UInt16 volume = 16383)
NAudio provides a robust audio library for .NET.
NAudio is an open source .NET audio and MIDI library, containing dozens of useful audio related classes intended to speed development of audio related utilities in .NET. It has been in development since 2002 and has grown to include a wide variety of features. While some parts of the library are relatively new and incomplete, the more mature features have undergone extensive testing and can be quickly used to add audio capabilities to an existing .NET application. NAudio can be quickly added to your .NET application using NuGet.
Here's an article that walks step-by-step through using NAudio to create a sine wave. You can create the sine wave with any desired frequency, for any desired duration:
http://msdn.microsoft.com/en-us/magazine/ee309883.aspx

LibAV - what approach to take for realtime audio and video capture?

I'm using libav to encode raw RGB24 frames to h264 and muxing it to flv. This works
all fine and I've streamed for more then 48 hours w/o any problems! My next step
is to add audio to the stream. I'll be capturing live audio and I want to encode it
in real time using speex, mp3 or nelly moser.
Background info
I'm new to digital audio and therefore I might be doing things wrong. But basically my application gets a "float" buffer with interleaved audio. This "audioIn" function gets called by the application framework I'm using. The buffer contains 256 samples per channel,
and I have 2 channels. Because I might be mixing terminology, this is how I use the
data:
// input = array with audio samples
// bufferSize = 256
// nChannels = 2
void audioIn(float * input, int bufferSize, int nChannels) {
// convert from float to S16
short* buf = new signed short[bufferSize * 2];
for(int i = 0; i < bufferSize; ++i) { // loop over all samples
int dx = i * 2;
buf[dx + 0] = (float)input[dx + 0] * numeric_limits<short>::max(); // convert frame of the first channel
buf[dx + 1] = (float)input[dx + 1] * numeric_limits<short>::max(); // convert frame of the second channel
}
// add this to the libav wrapper.
av.addAudioFrame((unsigned char*)buf, bufferSize, nChannels);
delete[] buf;
}
Now that I have a buffer, where each sample is 16 bits, I pass this short* buffer, to my
wrapper av.addAudioFrame() function. In this function I create a buffer, before I encode
the audio. From what I read, the AVCodecContext of the audio encoder sets the frame_size. This frame_size must match the number of samples in the buffer when calling avcodec_encode_audio2(). Why I think this, is because of what is documented here.
Then, especially the line:
If it is not set, frame->nb_samples must be equal to avctx->frame_size for all frames except the last.*(Please correct me here if I'm wrong about this).
After encoding I call av_interleaved_write_frame() to actually write the frame.
When I use mp3 as codec my application runs for about 1-2 minutes and then my server, which is receiving the video/audio stream (flv, tcp), disconnects with a message "Frame too large: 14485504". This message is generated because the rtmp-server is getting a frame which is way to big. And this is probably due to the fact I'm not interleaving correctly with libav.
Questions:
There quite some bits I'm not sure of, even when going through the source code of libav and therefore I hope if someone has an working example of encoding audio which comes from a buffer which which comes from "outside" libav (i.e. your own application). i.e. how do you create a buffer which is large enough for the encoder? How do you make the "realtime" streaming work when you need to wait on this buffer to fill up?
As I wrote above I need to keep track of a buffer before I can encode. Does someone else has some code which does this? I'm using AVAudioFifo now. The functions which encodes the audio and fills/read the buffer is here too: https://gist.github.com/62f717bbaa69ac7196be
I compiled with --enable-debug=3 and disable optimizations, but I'm not seeing any
debug information. How can I make libav more verbose?
Thanks!

Resources